COMMITTEE PREFACE



In 1989, the Technology Assessment Advisory Committee (TAAC) of the 
Commission on Preservation and Access was asked by the Commission to 
consider the potentials of various new technologies for the capture of 
printed and other information now at risk, and the storage and retrieval of 
preserved materials. This report is one in a series alerting the Commission
and others to developments and possibilities within the context of
national and international initiatives for preservation of and access to
information printed on disintegrated paper and other substrates. During
its first meetings, the Committee found the need for a framework within 
which to discuss the use of emerging technologies for preservation 
purposes — a framework that could also be shared with professionals 
working in the preservation and related fields.

The resulting "structured glossary", which represents the views and 
thinking of the full TAAC membership, was principally authored by 
M. Stuart Lynn with assistance from colleagues in the libraries and 
information technologies divisions at Cornell University. This paper has 
also been subjected to a pre-publication review by selected members of the 
library and information technologies professions at large. The Committee 
hopes that this Glossary will contribute to a common understanding of 
how preservation and access needs can be addressed by emerging 
technologies, in order to take full advantage of appropriate opportunities. 

Rowland Brown, Chair 

Technology Assessment Advisory Committee 



TAAC membership consists of representatives of the computer and 
communications industries, as well as corporate and higher education 
institutional consumers of advanced technologies. The members are: 
Adam Hodgkin, Managing Director, Cherwell Scientific Publishing 
Limited; Douglas van Houweling, Vice Provost for Information
Technologies, University of Michigan; Michael Lesk, Division Manager,
Computer Sciences Research, Bellcore; M. Stuart Lynn, Vice President for
Information Technologies, Cornell University; Robert Spinrad, Director,
Corporate Technology, Xerox Corporation; Robert L. Street, Vice President
for Information Resources, Stanford University; and Rowland C.W.
Brown, Chair, President, OCLC (retired).

FOREWORD



This document is offered as a structured glossary of terms associated with 
the technologies of document preservation, with particular emphasis on 
document media conversion technologies (often called "reformatting
technologies"), and even more particularly the use of digital computer
technologies. The Glossary also considers technologies associated with
access to such preserved documents. Such a glossary is intended for 
communication among people of different professional backgrounds, 
especially since in recent years there has been a proliferation of such 
technologies and associated technical terms, technologies and terms that 
cut across many disciplines.

The use of digital technologies, however, has implications for libraries 
that extend far beyond the boundaries of preservation and of access to 
preserved materials. Some of these implications are summarized in the
discussion in the Introduction of "The Impact of Digital Technologies,"
and are indicated throughout the Glossary. Thus this Glossary may serve a 
wider purpose than the title itself would imply. 

The Glossary is a structured glossary, in the sense that the defined terms
have been hierarchically grouped. The term "taxonomy" was used to
describe earlier drafts of the manuscript, but that term was dropped since it 
might imply a degree of completeness and form beyond that envisaged, or 
even possible. The Glossary is not intended to be complete with respect to 
preservation technology as a whole, but is highly selective (and even 
highly subjective) in its choice of terms to include, and very much slanted 
towards the use and impact of digital technologies. Other preservation 
technologies are sketched in for contextual purposes only. Within these 
constraints, the Glossary is intended to be comprehensive but not 
exhaustive. 

The Glossary is not intended to be so comprehensive as to satisfy the 
technologist only concerned with technologies, or the librarian exclusively 
concerned with librarianship and preservation. It is intended to satisfy the 
intersection of their concerns. On the other hand, issues of preservation 
and access raise concepts that have implications for librarianship as a 
whole, so that, in that sense, this Glossary has consequences that are not 
limited to the preservation arena alone.

INTRODUCTION

This document is offered as a structured glossary of terms associated with the
technologies of document preservation, with particular emphasis on
document media conversion technologies (often called "reformatting
technologies"),1 and even more particularly on the use of digital computer
technologies. The Glossary also considers technologies associated with access
to such preserved documents. Such a glossary is intended for communication
among people of different professional backgrounds, especially since in recent
years there has been a proliferation of such technologies and associated
technical terms, technologies and terms that cut across many disciplines.

The use of digital technologies, however, has implications for libraries that 
extend far beyond the boundaries of preservation and of access to preserved 
materials. Some of these implications are summarized in the following
discussion of "The Impact of Digital Technologies," and are indicated
throughout the Glossary. Thus this Glossary may serve a wider purpose than 
the title itself would imply. 

The Impact of Digital Technologies

The digital computer technology revolution continues to open up 
concepts, many of which are only just beginning to be understood or 
accepted. These concepts are critically important to librarianship in general 
and preservation in particular. In a world historically dominated by paper, 
the same medium is used for document capture (creation, recording),



1 See Section 3.1 for a discussion of the use of the term "media conversion" to replace the 
use of the term "reformatting." We also follow the distinction that while media conversion 
is not a conserving technology, it is a preserving technology.

storage, access, distribution and use, and there has been no compelling
need to consider these as separate entities. There has also been no 
compelling need to distinguish between the format of a document and the 
medium in which it is embodied, since there is only one dominant choice 
of medium. Indeed, the terms have traditionally been used somewhat 
interchangeably and indiscriminately. The introduction of non-paper 
forms such as phonograph recordings and films has modified this 
straightforward view somewhat, but traditional cataloging makes every 
effort to foster the constraint that there be a one-to-one correspondence
between the format and the medium, with the objective of identifying the 
combined format-medium with some physical shelf location. 

Further efforts to foster this constraint increasingly break down when 
digital technologies enter the picture. Digital technologies open a world 
that paradoxically is simultaneously more complex and, in some ways, 
simpler. It is more complex because now the same document or document 
format may intrinsically be represented in different media for different
purposes, forcefully motivating the need to distinguish carefully between 
the format and the medium. Furthermore, different media may be used 
interchangeably for different stages of document handling, that is, for 
capture, storage, access, distribution, and use. To complicate the situation 
even more, the documents may be encoded in a myriad of ways at each of 
these stages. 

And yet, separation of the format and the medium — and treating each 
stage of document handling separately — may open up a more logical 
structure free from traditional constraints. In this sense, digital
technologies may simplify certain aspects of librarianship. 

Digital technologies present many new challenges, however, that must be 
considered. For example, although these varying formats may be decoded 
and translated back and forth among each other, many fear that the means 
of decoding may become lost as a result of technological obsolescence, 
conceivably making digitally stored documents inaccessible. There are also 
many who question the longevity of the physical media used in digital 
technologies. Others suggest that the appropriate way to address both of
these problems (as well as to take advantage of the declining costs of
computer storage and of increasing storage densities) may well be to
copy stored documents periodically onto new media.
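
As a minimal sketch of what one such periodic recopying step might look like in practice, the following Python fragment copies a stored document to a new location and confirms that the copy is bit-for-bit identical by comparing checksums. The function names (sha256_of, refresh_copy) and the choice of SHA-256 are illustrative assumptions, not procedures described in this Glossary.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_copy(source: Path, target: Path) -> str:
    """Copy a stored document onto a new medium and verify the copy.

    Hypothetical helper: returns the digest shared by both copies and
    raises OSError if the new copy does not match the original.
    """
    before = sha256_of(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    after = sha256_of(target)
    if before != after:
        raise OSError(f"copy of {source} does not match the original")
    return after
```

A digest recorded at each recopying could also be kept alongside the document, so that a later recopying can detect deterioration of the medium before the old copy is retired.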

Indeed the main advantage of the world of digital technologies, namely 
that they represent a kind of "Esperanto" of mutually comprehensible and
interchangeable formats, may, if not properly managed, also represent 
their biggest weakness, because of the rapidity of change and obsolescence, 
and because of the wide range of choices available at any given time. Their
very attractiveness could lure the unwary or the uninformed into
dangerous territory. 

Periodic recopying onto new media represents a whole new approach for 
libraries to the operation and financing of "inventory management"
(although such practices are quite common in data centers). The
implications could be quite extensive. Librarians tend to think in terms of
periods of centuries rather than having (or wanting) to recopy every few
years. Such considerations may either hinder the adoption of digital 
technologies for preservation or other purposes or eventually cause some 
rethinking of the underlying economics of librarianship. 

The incentive for such potential reevaluation, however, is not limited to
the preservation of older materials, nor is the influence of technology the 
only driving factor. The underlying stimulus is a gradual transition over 
the centuries — perhaps spurred by the exponential growth of recorded 
knowledge and information — from documents with associated physical 
or conceptually useful lifetimes, times between new editions, or, more 
generically, times between "instances", that can be measured in decades or
centuries; to documents with associated times between instances 
measured in much shorter units of time — even, in the case of "active 
documents" (see below), measured in minutes or seconds. 

In essence, this represents a transition from "batch processing" to 
"continuous processing."2 The financial and other implications of this 
could undoubtedly be far-reaching for libraries (a full discussion is beyond 
the scope of this Glossary), introducing into the library milieu unfamiliar 
(or, at least, largely unused) concepts associated with continuous processes
or processes with relatively short lifetimes, such as "depreciation" and
"lifecycle costing." These are concepts that are familiar to the world of
digital electronic processing and quite normal outside of universities, but
that have been avoided in worlds, such as research libraries, that
depend to a greater or lesser extent upon irregular gifts or grants of varying 
or unpredictable size, donations directed to the purchase and immediate 
storage of documents, but not to their maintenance. Indeed, one of the 
most serious questions facing librarians in the future may be how to effect 
a match between the changing economic demands of "continuous 
processes" and the traditional nature of many funding sources. Will 
donors, for example, be as willing to support the continuous demands of 
technological processing as they have historically and generously 
supported the periodic construction of library buildings? What 



2 This analogy was pointed out by Douglas van Houweling.

implications does the financing of continuous processes have for the
"free" and openly accessible library?3

Yet the potential of digital technologies and of the flexibility they offer is 
boundless. Over the coming decades, these technologies may open up
vistas of ever-increasing storage densities to where entire libraries can be
electronically stored in the space of a single room; of blinding access and
distribution speeds allowing whole documents to be moved almost 
instantly across the nation's (and indeed the world's) data networks, 
leading to the concept of the "distributed library;" of ease of replication at 
very modest cost (another cause for alarm, particularly to those concerned 
with protection of intellectual property); of "print-on-demand" where 
paper copies of documents are only printed "just in time" and not 
inventoried in advance of need; of accessibility at a distance away from 
where the "digital document" or preservation copy was created or is 
stored; and of intelligent automated document analysis. Indeed, the means 
of creation and production of documents have already been 
revolutionized by these technologies. 

These technologies also open up horizons for totally new document 
formats, such as active documents whose contents may combine different 
media such as text, sound, video or voice; or whose contents may change 
dynamically with time, what Harvey Wheeler called "the fungible book."4
The preservation of these new "active" formats is not of direct interest to 
the subject of preservation of more traditional formats (and therefore 
beyond the scope of this Glossary), but is of indirect interest because 
digitally preserved traditional documents can be incorporated into such 
active documents. Furthermore, contemporary active documents will 
become a subject of future preservation interest. 

Some view the introduction of digital technologies into the world of
libraries as likely to cause a revolution as far-reaching as that caused by the 
printing-press: a massive paradigm shift. Others view the introduction 
with concern (one cannot help but recall that the monks at first also 
viewed the introduction of the printing press with equal concern), an 
intimidating perturbation that disturbs an equilibrium and modalities of 
scholarship that have served well for many decades or even for centuries.

Either way, digital technologies cannot be ignored. They are already with 
us. The question is not whether they will have a presence, but the pace



3 A glimpse of possible implications has already been seen in the tendency of many libraries
to charge patrons for searches of electronic databases.

4 Harvey Wheeler: "The Virtual Library: The Electronic Library Developing Within The
Traditional Library". Doheny Documents, University of Southern California University
Library, 1987.

and degree to which that presence will grow and influence. The next
twenty years are likely to be times of extraordinary change. Our libraries — 
indeed our universities, colleges, and our scholarly communities — may 
well be remade by the consequences of this technological revolution. 

And yet — in spite of technology's impact and of the revolutionary 
consequences of that impact — it must be recognized that technology itself 
is not the ultimate driving force. It is the inexorable pressure caused by the
exponential growth of recorded knowledge, and the ever-increasing
complexity, costs, and other problems associated with the storage and
distribution of, and access to, such information. Technology can provide
some solutions: it is not an end in itself. 

Furthermore, for many reasons too numerous to detail here, the
"digital library" is not about to replace the "paper library." Both will need 
to coexist in a shifting environment, at least for the foreseeable future. 
This in itself will present librarians with many economic, organizational, 
social, technical, and other challenges. 

Between the eager apostles of technology and those who approach change
with extreme caution lies the mass of professionals who are trying to 
understand and grapple with the potential of this shifting environment, 
many of them implementing prototype activities designed to elucidate 
greater insight,5 many working to close the gap between promise and
reality. 

It is to these professionals — from all fields — that this Glossary is 
dedicated, to provide a common language for dialogue and mutual 
understanding, particularly as is required to address the problems of 
preservation, and the potential application of digital technologies to those 
problems. The Glossary is not intended to be so comprehensive as to 
satisfy the technologist only concerned with technologies, or the librarian 
exclusively concerned with librarianship and preservation. It is intended 
to satisfy the intersection of their concerns. On the other hand, issues of
preservation and access raise concepts that have implications for 
librarianship as a whole, so that, in that sense, this Glossary has 
consequences that are not limited to the preservation arena alone. 

Scope of the Glossary

This document is a structured glossary, in the sense that the terms have 
been hierarchically grouped. The term "taxonomy" was used to describe 
earlier drafts of the manuscript, but that term was dropped since it might



5 Some fields, particularly those propelled by the impetus of commercial endeavors such as 
medicine, law, and finance, are beyond the prototype stage and are into full production.

imply a degree of completeness and form beyond that envisaged, or even
possible, for such a document. This document is not intended to be 
complete with respect to preservation and access technologies as a whole, 
but is highly selective (and even highly subjective) in its choice of terms to 
include, and very much slanted towards the use and impact of digital 
technologies. Other preservation technologies are sketched in for 
contextual purposes only. Within these constraints, the Glossary is 
intended to be comprehensive but not exhaustive. 

The Glossary is not intended to solve all issues associated with the 
definition of technological and other terms associated with preservation 
and access. It is a conceptual document. Not all terms are defined with
equal precision; indeed, the degree of precision is largely directed by the 
extent to which it is necessary to distinguish among these terms. The 
Glossary is intended to be adequate to support further research and 
development on the subject. Indeed, one measure of success of the 
Glossary will be the extent to which it stimulates additional work in the
field, including refinements of the Glossary itself. 

For the conceptual reasons outlined above, the Glossary departs from 
many well-established norms. Furthermore, excluded in any detail are
terms primarily associated with conservation, such as paper
deacidification, where every effort is made to preserve the documents in
their original physical form,6 or hand conservation. The focus, as stated, is
on preservation through media conversion (traditionally known as
"reformatting", a term which we do not favor in this Glossary; see 3.1),
where the objective is to preserve the intellectual content of the original 
document on some other medium, and also if desired to produce at some 
later stage a close physical facsimile of the original, at least to the extent 
allowed by the technology. 

The focus is also for the most part on paper documents requiring 
preservation. These represent the principal (but not the only) area of
national and international attention: paper documents have the longest 
history and exist in the greatest numbers. They are also in urgent need of 
preservation because of the "embrittlement" (see 1.5.4) caused by the high 
acid content of paper manufactured since the mid-nineteenth century and 
by improper storage environments. In the years to come, the focus may 
well shift to other media. There is already, for example, considerable 
attention paid to film preservation, and video recordings are already
deteriorating at an alarming rate. 



6 Conservation may allow for only partial preservation of the original document. The 
bindings, for example, may be replaced while the body of the document is conserved.

Different technologies are more or less suitable to preserve different
classes of documents or for achieving different access or other objectives.
One of the main applications intended for this Glossary is for the 
classification of ranges of activity that can be used to describe different 
investigations into preservation and access methodologies. The level of 
detail varies throughout the Glossary according to what we believe is 
necessary to make the Glossary most pertinent to this intended 
application. 

Structure of the Glossary

The Glossary is divided into three main sections: the Original Document, 
the Selection Process, and the Preserved Copy. The latter is dealt with in 
the most detail; in turn it is divided into a number of subsections: the first 
defines the actual preservation or media conversion technologies that 
may be employed; and the remaining subsections are devoted to the 
various technologies employed in the different stages of preservation and
access — capture, storage, access, distribution, and presentation. 

The reader will observe that there is some repetition of discussion of 
certain concepts throughout the Glossary. This is intentionally introduced, 
since it is expected that most readers will not choose to read the Glossary 
from cover to cover. 

The overall structure of the Glossary is presented in Figure 1.

1. THE ORIGINAL DOCUMENT

Different preservation or media conversion technologies are 
appropriate to different kinds of original material. This section,
therefore, is devoted to a classification of terms used in describing the
original document to be preserved, particularly those terms that need
to be referenced in the context of media conversion.

The term document is used generically throughout this Glossary to 
include all forms of books, manuscripts, records and other classes of 
material containing information or other matter of intellectual 
content, regardless of the actual medium (1.1) or format (1.2) employed.

The Glossary takes free license with terms that have taken on a 
traditional meaning in the context of cataloging and other library 
activities, and in fact frequently departs from traditional norms used in 
this area. As stated in the Introduction, the reason for this is that such 
traditional definitions often confuse the format and content of the
document with the medium used to record it, terms that have 
traditionally been used somewhat interchangeably and 
indiscriminately. This made sense when paper was the primary 
medium used for document capture, storage, distribution, and use. 
With newer technologies, however, and particularly with those used 
for media conversion (3.1), different media can be used for each of 
these stages, and, in fact, different media can be used for different 
instances of each stage. In this context, therefore, it makes taxonomic 
sense to separate format from medium.

For example, a traditional classification is "Motion pictures and video 
recordings." In our Glossary, the document format would be "motion 
pictures." The medium could be "film" or "videotape" or even "digital 
electronic" (such as with digital video). Even a book (document format) 
could be embodied in different media: "paper," "audio" (the "talking 
book"), "microform," or "digital electronic." To extend the example,
the book could be stored in a digital electronic medium, and
subsequently distributed electronically, and used by "printing-on-demand"
on paper or microform, or by presentation at a digital
computer workstation. 
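
The separation of format from medium described above can be sketched as a small data structure. The following Python fragment is only an illustration under stated assumptions: the class and attribute names (Stage, Document, media_by_stage) are hypothetical and are not part of any cataloging standard; the stages and media named are those used in the example above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """Stages of document handling named in this Glossary."""
    CAPTURE = "capture"
    STORAGE = "storage"
    ACCESS = "access"
    DISTRIBUTION = "distribution"
    USE = "use"


@dataclass
class Document:
    """A document whose format is recorded separately from its media."""
    title: str
    doc_format: str  # e.g. "book" or "motion pictures"
    media_by_stage: dict = field(default_factory=dict)

    def embody(self, stage: Stage, medium: str) -> None:
        """Record that this stage of handling uses the given medium."""
        self.media_by_stage.setdefault(stage, []).append(medium)

    def media(self) -> set:
        """Return every medium in which the document is embodied."""
        return {m for media in self.media_by_stage.values() for m in media}


# The book from the example: one format, several media at different stages.
book = Document(title="(hypothetical title)", doc_format="book")
book.embody(Stage.STORAGE, "digital electronic")
book.embody(Stage.DISTRIBUTION, "digital electronic")
book.embody(Stage.USE, "paper")  # printing-on-demand
book.embody(Stage.USE, "microform")
```

The format is stated once, while the medium is allowed to vary by stage and even within a stage, which is the distinction a combined "format-medium" classification cannot express.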



[Figure: outline of THE ORIGINAL DOCUMENT: 1.1 Medium, 1.2 Format, 1.3 Periodicity, 1.4 Properties, 1.5 Condition]

1.1. Document Medium

Document Medium refers to the material upon which the 
original document was recorded.

1.1.1. Paper 

Paper is a medium traditionally used for printed books and other 
documents that are the most frequent target of preservation 
efforts. Paper is defined to be sheets usually made of vegetable 
fibers laid down on a fine screen from a water suspension. 
Marks are imprinted on the paper using any of a number of 
techniques including handwriting or drawing using a variety of 
media such as pencil, pen and ink, or pastel; various forms of 
printing using inks (numerous technologies are used to 
accomplish this); photographic printing, where paper coated 
with light-sensitive emulsion is exposed to various intensities of 
light; xerographic printing, where an electrically charged
photoconductive insulating surface is selectively exposed to light 
and the latent image is developed with a resinous powder; 
thermographic printing, where the paper is exposed to a directed 
heat source that selectively modifies parts of the surface that may 
have been pre-treated with a heat-sensitive powder; and 
chemical transfer printing, where the surface of the paper is 
chemically coated and selectively modified by pressure or other 
means.

Parchment and vellum are not paper since they are made from
the skins of sheep, goats, or calves.7 We note them here for
completeness. 

Hard Copy is a term often used to denote any document
produced on paper. 

1.1.2. Microform 

Microform refers to a document medium for producing or 
reproducing printed matter. It records microimages, that is, 
images too small to be read without some form of magnification. 
In a general sense, microforms may be on film (1.1.4) or paper
(1.1.1), but for purposes of this Glossary the definition is 
restricted to film. Reading a microform requires the assistance of 
a microform reader (3.6.2.2). Microform comes in different styles 
including microfilm (a film roll that contains microimages 
arranged sequentially) and microfiche (sheets of film in which 
many microimages are arranged in a grid pattern). Both usually 
contain a header that can be read without magnification.

Microforms are an economic and compact form of document 
representation for archival storage, but are inconvenient to read 
when compared with a printed book. Microform technology is 
used as a preservation medium (3.1.4), as a means of saving 
space (such as for the convenient storage of newspapers), or as a
means of duplicating scarce or unique documents, that is,
microreproductions of other original documents. However,
microform is sometimes used for original documents: for
example, those created on a computer and directly printed out
onto a computer-output-on-microfiche (COM) device; and for
microreproductions of material assembled for the purposes of 
releasing an original edition in microform. 

1.1.3. Video 

Video is normally an analog (see definition under 1.1.6) 
electronic technology for recording still or moving images, 
usually combined with sound (cf. 1.1.5). Following standards 
(which vary across the world) defined for television playback
and broadcasting, the images are normally recorded on magnetic 
tape (3.3.1.6.2), when it is known as videotape, but also on other 
physical media such as optical disk (3.3.1.6.3) (videodisk). 



7 Originally, the term "vellum" was restricted to calfskin. The distinction between
parchment and vellum has eroded over the years.

Playback is usually achieved through a television set or video
projector (3.6.2.3), although it is now possible and becoming
common to play video recordings back through a computer
(3.6.2.6) or multimedia workstation (3.6.2.7).

1.1.4. Film 

Film is a recording medium consisting of thin sheets or strips of 
transparent or translucent material, such as polyester or acetate,
coated with a light-sensitive emulsion. Recording occurs by 
exposing the film to the light emitted or reflected by the entity 
being recorded. Film is also the medium used for microfilm 
recording (1.1.2). A photograph (1.2.9.3) is produced using 
essentially the same technology, except that normally the light-
sensitive emulsion is adhered to paper or some other opaque
medium. 

1.1.5. Audio 

Audio documents are recordings made on a variety of (usually) 
magnetic media (see 3.3.1.6) of sounds only (as contrasted with 
video recordings (1.1.3) that also combine images). The
evolution of such audio recordings has traversed a large number
of different formats and physical media, including phonograph
disks (records) of varying size (78 rpm's, 45 rpm's, 33 rpm's) and
tape cassettes (of different formats), both of which are analog (see
1.1.6) recording technologies; and, more recently, compact disks
and digital audio tapes (DATs), which are digitally (1.1.6)
encoded. 

1.1.6. Digital Electronic

Digital Electronic Technologies8 are technologies used to capture
(3.2.3), store (3.3.1.6), transform (3.3.2, 3.3.4), distribute (3.5.1.6) or
present (3.6.1.6, 3.6.2.6, 3.6.2.7) information in quantized
electronic form (normally as a sequence of 0's and 1's known as
bits). Digital, in which information is quantized discretely, is to
be contrasted with Analog, in which information is not
quantized but maintained in a continuous format.9 A video



8 The term digital technologies is also used for brevity throughout this Glossary.

9 The non-technical reader may wish to compare the odometer of a car (a digital device
which quantizes in precise 1/10th of a mile increments) with the speedometer (an analog
device which displays speed continuously but which can only be interpreted approximately).

recording (1.1.3) is an example of an electronic technology that
is analog.10
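
The contrast between analog and digital can be made concrete with a short sketch in Python, following the odometer comparison in footnote 9. The function names (quantize, to_bits) and the sample distance are assumptions chosen for illustration only; real digitizing equipment is considerably more involved.

```python
import math


def quantize(reading: float, step: float) -> int:
    """Return how many whole steps fit into a continuous reading."""
    return math.floor(reading / step)


def to_bits(count: int) -> str:
    """Encode a non-negative step count as a string of 0's and 1's."""
    return format(count, "b")


# A continuous (analog) distance in miles; the value is hypothetical.
distance_travelled = 12.3478

# The odometer keeps only whole tenths of a mile, as in footnote 9.
tenths = quantize(distance_travelled, 0.1)
bits = to_bits(tenths)
```

Here the continuous distance is reduced to 123 whole tenths of a mile, which can then be stored as the bit sequence 1111011. The fraction of a tenth that was discarded cannot be recovered from the digital value, but the value itself can be copied any number of times without further loss.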

For a variety of reasons, digital technologies are gradually 
replacing analog technologies. Reasons of importance to this 
Glossary are the convertibility of digital technologies among 
each other and into and from other technologies (such as paper 
and voice), so that digital technologies become a kind of lingua
franca of communication and storage; and the ease of
transmission of information by digital technologies across
networks (3.5.5) to facilitate communication at a distance.

Original documents that are of concern for library preservation 
purposes are not normally encoded in a digital electronic 
medium.11 Since this may become a subject of future concern,
the category is included for completeness. Definitions, however,
are more appropriately included under Storage Technology 
Medium (3.3.1.6). 

1.1.6.1 Magnetic Disk (see 3.3.1.6.1) 

1.1.6.2 Magnetic Tape (see 3.3.1.6.2)

1.1.6.3 Optical Disk (see 3.3.1.6.3)

1.1.6.4 Optical Tape (see 3.3.1.6.4)

1.1.6.5 Magneto-Optical Disk (see 3.3.1.6.5)

1.1.7. Multi-Media

Multi-Media is a term used to denote documents created using a 
number of different media simultaneously, usually those with 
an electronic technological basis: for example, a digital electronic 
recording (1.1.6) that also combines video (1.1.3) and audio 
(1.1.5), and that may, as part of the document, intrinsically 
produce paper (1.1.1) outputs. 



10 However, digital(ly-encoded) video is now becoming part of the panoply of technologies, 
where analog video signals are converted to digital signals for purposes of storage, 
transmission and playback through a computer (3.6.2.6) or multi-media (3.6.2.7) 
workstation. 

11 This assertion, however, may not be true in the future. For example, music is now 
recorded in digital electronic form, such as DDD Compact Discs. 

1.2. Document Format 

Document Format refers to the class of document with respect to 
its style, arrangement, or layout. 

Although this Glossary emphasizes the distinction between 
format and medium, some formats are more closely associated 
with a given medium. Thus, formats such as documentary, 
short, feature, and newsreel are most closely associated with the 
medium of film. Consistent with the main thrust of this 
Glossary, we emphasize those formats that are mostly associated 
with the medium of paper, even though several of these formats 
may also be embodied in other media (the "talking book," for 
example, recorded, say, on tape cassettes). 

The term "format" itself may be too all-encompassing. There 
may be a need to further distinguish between the "type" of a 
document, such as "book," and the arrangement or layout of the 
book, such as formatted text on pages or simply linear text 
that is not formatted into pages (as in the "talking book" where 
pages are not distinguished). However, this Glossary does not 
make this distinction, partly because of its focus on the paper 
milieu, where such a distinction may not be necessary, and 
partly because in the emerging world of digital technologies it 
may be premature to attempt such a distinction. 

The use of the term "format" should not be confused with its 
use in the context of "reformatting." The latter, as described in 
3.1, is best replaced by the term "media conversion." 

1.2.1. Manuscript 

For purposes of this Glossary, an original, unpublished 
document directly created by its author(s), usually on paper or 
parchment, and often in the author's own hand. 

1.2.2. Book 

A monograph (1.3.1) publication containing more than 49 pages, 
usually on paper.^12 

1.2.3. Pamphlet 

A complete monograph (1.3.1) of at least 5 but not more than 49 
pages, usually on paper (see Footnote 12). 
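
The page-count boundaries that separate a Book (1.2.2) from a Pamphlet (1.2.3) can be written out as a small rule. The sketch below assumes the page count of a monograph is already known; the function name and the "unclassified" label for works of fewer than 5 pages are hypothetical, since this Glossary does not name that case.

```python
def classify_monograph(pages: int) -> str:
    """Apply the page-count thresholds given in 1.2.2 and 1.2.3."""
    if pages > 49:
        return "book"          # more than 49 pages (1.2.2)
    if pages >= 5:
        return "pamphlet"      # at least 5 but not more than 49 pages (1.2.3)
    return "unclassified"      # hypothetical label; not defined in the Glossary
```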

1.2.4. Newspaper 

A serial (1.3.2) publication issued at stated, frequent intervals 
containing news, opinions, advertisements, and other topical 
material, usually on paper (see Footnote 12). 

1.2.5. Printed Sheet 

A single sheet of printed paper such as a poster (but see 1.2.9.4), 
broadside, folded leaflet, or memorandum, usually on paper. 

1.2.6. Periodical 

A serial publication (1.3.2) appearing at regular or stated 
intervals, generally more frequently than annually, usually on 
paper (see Footnote 12). Includes magazines and journals. 

1.2.7. Cartographic Materials 

Representations of a selection of abstract features of the 
universe, most often in relation to the surface of the earth, often 
on paper but also on other substrates. 



12 Although an increasing number of books are published on other media (see the Introduction 
to this Section). This remark also applies to 1.2.3, 1.2.4, 1.2.5, 1.2.6, and 1.2.8. Video 
magazines and journals, for example, are beginning to appear. A few books are being 
published only in digital form for playback on a computer workstation. 

1.2.8. Music 

In this context, printed representation of musical notation for 
instrumental, chamber, orchestral, and vocal scores, usually on 
paper (see Footnote 12). 

1.2.9. Graphic Materials 

1.2.9.1 Art Originals, Prints, and Reproductions 

Illustrated works, such as drawings, engravings, and lithographs, 
issued separately from books. 

The following terms are included for completeness, but without 
definitions^13: 

1.2.9.2 Filmstrips 

1.2.9.3 Photographs, Slides, Transparencies, and Stereographs 

1.2.9.4 Pictures, Postcards, and Posters 

1.2.9.5 Technical Drawings (including Architectural Plans) 

1.2.9.6 Miscellaneous 

The Miscellaneous category includes flash cards, radiographs, study 
prints, and wall charts. 



1.2.10. Data File 

The term Data File is used generically to denote a document 
consisting of a collection of data, normally organized in some 
logical fashion so as to facilitate access (3.4). Such data may 
consist of factual information, statistics, numbers, textual, or 
composite records to be used as a basis for reasoning, discussion, 
or calculation. An entity within a data file is known as a (data) 
record. A collection of data files is sometimes known as a 
databank, particularly when the data files are electronically 
encoded (1.1.6). 

Although data files may be encoded in any medium (for example, a 
paper card index file is an example of a data file), the term has 
most often come to be used in connection with data files that are 
electronically encoded and stored in digital electronic form 
(3.3.1.6). 



13 In keeping with the spirit noted in the Foreword that this Glossary is intended to be 
comprehensive but not exhaustive. 

1.2.10.1 Table 



A data file arranged into two-dimensional form, normally consisting of 
rows and columns together with headings or labels to depict the 
contents of the rows and columns. Tables may themselves contain other 
tables as elements resulting in a "latticed" arrangement of data. A 
spreadsheet is a special form of table originally used for accounting 
purposes and containing financial data, but which now includes a wide 
variety of complex reports arranged in tabular form, often with the aid 
of computer workstations (3.6.2.6). 

1.3. Document Periodicity 

Periodicity refers to the number of parts into which the 
document is divided and the manner or sequence in which 
those parts are or have been published. 



1.3.1. Monograph 

A Monograph is a published work, collection, or other 
document that is not a serial (1.3.2). 

1.3.2. Serial 

A Serial is a publication issued in successive parts, bearing 
numerical or chronological designations, at regular or irregular 
intervals and intended to continue indefinitely. 

1.4. Document Properties 

Document Properties refers to a classification of various 
components of documents as to their different tonal or color 
content and as to the types of objects^14 they contain. Emphasis is 
placed on those properties most closely associated with 
documents produced on paper. 

1.4.1. Tone 

Tone refers to the color quality or color content of the document 
or parts of the document regardless of form or material content. 

1.4.1.1 Monotone 

Monotone documents (or parts of documents) are printed or otherwise 
produced using one color hue^15 only, most often black or near-black. 

1.4.1.1.1 Two-Tone 

Those parts of a monotone document that are represented in only 
two contrasting tones (regardless of the hue of the color, 
although the term is most often associated with black hues), 
with no intermediate shades. Thus, for purposes of this 
Glossary, a book printed with red ink on yellow paper would be 
considered two-tone. When one of the shades is black or near-black, 
and the other white or near-white, the document is 
described as being produced in black-and-white. 

1.4.1.1.2 Greyscale 

Those parts of a monotone document that are presented using a 
range of tones (regardless of the hue of the underlying color). 
The range of tones may either be continuous (such as in a 
photograph), where all possible values may essentially be 
taken on, or discrete, where only a finite set of values may be 
taken on. 



14 The term "object" is used here in a sense that is more familiar to computer professionals 
than to librarians. 

15 Strictly speaking, monotone documents should be termed "monohue." 

1.4.1.2 Highlight Color 

A two-tone (1.4.1.1.1) document, parts of which additionally contain 
areas highlighted with a second single color of uniform shade. 



1.4.1.3 Two Color 

A document containing two colors, intermixed to create intervening hues, 
and two extreme tones (normally black and white) used to create a 
continuous or discrete (see 1.4.1.1.2) range of shades. 

1.4.1.4 Full Color 

A document containing or attempting to contain a full range of colors, 
normally of all hues, tones, and shades. 

1.4.2. Object Type 

Object Type (see also Footnote 14) is a descriptor that conveys 
information about a given sub-area (object) of the document 
with regard to the manner in which it conveys data or 
information. 

1.4.2.1 Text Objects 

Text Objects are document objects consisting of written or printed (or 
otherwise displayed) stored words or ideograms. 



1.4.2.2 Data Objects 

Data Objects are document objects consisting of factual information 
normally arranged into datafiles (1.2.10) or tables (1.2.10.1) which are 
used as a basis for reasoning, discussion, or calculation. 

1.4.2.2.3 Table 

See 1.2.10.1. 

1.4.2.3 Graphic Objects 

Graphic Objects are document objects containing image information 
consisting of artwork, photographs, technical drawings, etc., perhaps 
containing limited amounts of text, usually as captions or for labelling 
purposes. 

1.4.2.3.1 Line Art 



Graphic objects created entirely from the use of text, dots, and 
straight or curved lines. 

1.4.2.3.1.1 Graphs 

Line art objects consisting of representations of the 
interrelationships of data in pictorial form. 



1.4.2.3.2 Halftone 



A representation of a greyscale (1.4.1.1.2) or color graphic object 
as a series of dots obtained, for example, by photographing or 
scanning an image through a mesh screen. By limiting the dots 
to, say, black and white (for example, by using high-contrast 
film), the illusion of greyscale may be created in a two-tone or 
black-and-white document (1.4.1.1.1). 
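
A rough digital counterpart of the mesh screen can be sketched as follows. This is not the photographic process described above but a swapped-in technique, ordered dithering with a 2x2 threshold matrix, which shows the same principle: each position of the repeating "screen" turns a grey value into a black or white dot. The names `SCREEN` and `halftone`, and the convention that 0.0 is black and 1.0 is white, are assumptions made for the illustration.

```python
# A repeating 2x2 threshold "screen" (a Bayer matrix scaled to the 0..1 range).
SCREEN = [[0.125, 0.625],
          [0.875, 0.375]]


def halftone(grey):
    """Convert rows of grey values (0.0 black .. 1.0 white) to dots.

    Each output cell is 1 (white) if the grey value exceeds the screen
    threshold at that position, otherwise 0 (black).
    """
    return [
        [1 if value > SCREEN[y % 2][x % 2] else 0
         for x, value in enumerate(row)]
        for y, row in enumerate(grey)
    ]


# A uniform mid-grey patch becomes a checkerboard of black and white dots.
mid_grey = [[0.5] * 4 for _ in range(4)]
dots = halftone(mid_grey)
```

Viewed from a distance, the proportion of white dots in the patch approximates the original grey level, which is the illusion of greyscale referred to in the definition.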

1.4.2.3.3 Discrete Tone 

A greyscale or color (1.4.1.4) graphic object where the tones 
take on discrete (normally equispaced) values within a range. 

1.4.2.3.4 Continuous Tone 

A greyscale (1.4.1.1.2) or color (1.4.1.4) graphic object where the 
tones fall continuously across an entire range of values, such as 
in a photograph (1.1.4, 1.2.9.3). 

1.5. Document Condition 

Condition refers to the physical state of the document compared 
with its state when originally published. The following presents 
only those characteristics of the physical state of a document that 
are pertinent to the main thrust of this Glossary, that is, to the 
paper milieu. 



1.5.1. Archival 

A document that can be expected to be kept permanently as 
closely as possible to its original form. An archival document 
medium is one that can be "expected" to retain permanently its 
original characteristics (such expectations may or may not prove 
to be realized in actual practice). A document published in such a 
medium is of archival quality and can be expected to resist 
deterioration. 

Permanent paper is manufactured to resist chemical action so as 
to retard the effects of aging as determined by precise technical 
specifications. Durability refers to certain lasting qualities with 
respect to folding and tear resistance. 

See also 3.3.5. 

1.5.2. Non-Archival 

A document that is not intended or cannot be expected to be kept 
permanently, and that may therefore be created or published on 
a medium (1.1) that cannot be expected to retain its original 
characteristics and resist deterioration. 

1.5.3. Acidic 

A condition in which the concentration of hydrogen ions in an 
aqueous solution exceeds that of the hydroxyl ions. In paper, the 
strength of the acid denotes the state of deterioration that, if not 
chemically reversed (3.1.2), will result in embrittlement (1.5.4). 
Discoloration of the paper (for example, yellowing) may be an 
early sign of deterioration in paper. 

1.5.4. Brittle 

That property of a material that causes it to break or crack when 
depressed by bending. In paper, evidence of deterioration usually 
is exhibited by the paper's inability to withstand one or two 
(different standards are used) double corner folds. A corner fold 
is characterized by bending the corner of a page completely over 
on itself, and a double corner fold consists of repeating the action 
twice. 

1.5.5. Other 

There are many other conditions that characterize the condition 
of a document. Bindings of books, for example, may have 
deteriorated for a variety of reasons. Non-paper documents 
may exhibit a variety of conditions (see, for example, 3.3.5 for a 
discussion of the concept of "Useful Life"). However, with the 
focus on paper original documents and on media conversion 
technologies for preservation, a full analysis of document 
condition would be beyond the scope of this Glossary. 

1.6. Document Content 

Document Content refers to the substance of the material or 
information within the document that is intended to be 
communicated. 

1.6.1. Intellectual Content 

Intellectual Content refers to the ideas, thought processes, artistic 
expressions, etc., contained within the document. 

1.6.2. Copyright^16 

Copyright refers to a means of legal protection provided to the 
author(s) of original published and unpublished works that 
have been "fixed in a tangible form of expression," in order to 
afford such authors the exclusive right of exploitation, in 
particular the right to control the reproduction, distribution, 
performance, or display of the work, or to control the 
preparation of derivative works. Often, exploitation of the 
work by others requires the consent of the author(s) and the 
payment of a royalty to the author(s), usually in the form of a 
fixed sum of money for each copy made, shown, or distributed. 



16 Copyright law as it applies to the subject of preservation will be the subject of a 
forthcoming paper by the Commission on Preservation and Access. 

For works copyrighted in the United States after January 1, 1978, 
protection afforded to the author(s) or the author(s)' estate is 
usually for the author(s)' lifetime plus 50 years. For works 
created prior to that date, the copyright period was 28 years from 
the date of publication (or the date of registration of copyright for 
unpublished works), plus an additional period of 47 years for 
works whose copyright was renewed during the last year of the 
first term. 
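
The term arithmetic in the preceding paragraph can be worked through as a short sketch. It uses only the figures stated in this Glossary (28 years, plus 47 on renewal, or the author's lifetime plus 50 years), works in whole calendar years, and ignores every other rule; the function names are hypothetical and the result is an illustration, not a determination of copyright status.

```python
FIRST_TERM_YEARS = 28      # first term for works predating January 1, 1978
RENEWAL_TERM_YEARS = 47    # additional period if the copyright was renewed
POST_1978_YEARS = 50       # years after the author's death


def last_protected_year_pre_1978(publication_year: int, renewed: bool) -> int:
    """Year in which protection ends for a work under the older rule."""
    extra = RENEWAL_TERM_YEARS if renewed else 0
    return publication_year + FIRST_TERM_YEARS + extra


def last_protected_year_post_1978(author_death_year: int) -> int:
    """Year in which protection ends under the lifetime-plus-50 rule."""
    return author_death_year + POST_1978_YEARS
```

For example, under these figures a work published in 1930 and never renewed would have left protection in 1958, while the same work with a renewed copyright would remain protected for 75 years in total.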

Works published in the United States may be afforded protection 
in countries that were members of the Universal Copyright 
Convention or of the Berne Convention for the Protection of 
Literary and Artistic Works. Conversely, works published in 
such member countries are protected within the United States. 

Most works that are the subject of preservation interest were 
published before 1978. The copyrights on the majority of those 
works were not renewed for the optional second term. Thus, the 
copyrights have expired on most of the works of current 
preservation interest that were subject to United States copyright 
protection. However, since this is not true of all such works, the 
normal practice is to check copyright ownership to verify 
clearance. 



1.6.3. Structure 

Structure refers to the divisions within a document provided for 
ease of access, reference, and other purposes. The broad structure 
of a given document is likely to vary according to its format (1.2), 
and there is also not necessarily any standard structure for a 
given format. With its long history, the structure of the printed 
book (1.2.2) has evolved towards a somewhat standard structure. 
Because of the focus of this Glossary on the preservation of the 
printed book, a typical book structure is presented here and 
structures for other formats are omitted. 



17 For a fuller explanation of copyright laws, see "Copyright Basics," Circular No. 1, 
published by the Copyright Office of the U.S. Library of Congress, Washington, DC 20559. 

1.6.3.1 Abstract (see 3.4.1.2) 

1.6.3.2 Title Page 

The Title Page of a work normally contains the title of the work, its 
author(s), and the name of the publisher. 

1.6.3.3 Table of Contents (see 3.4.1.3) 

1.6.3.4 List of Figures, Tables, Maps or Other Illustrations (see 3.4.1.4) 

1.6.3.5 Preface (see 3.4.1.5) 

1.6.3.6 Introduction (see 3.4.1.6) 

1.6.3.7 Body 

The Body of a document refers to the main corpus of the work. It may be 
divided into chapters, papers, articles, or other segments. 

1.6.3.8 Index (see 3.4.1.7) 

1.6.3.9 Other 

This category includes publisher's notes, credits, frontispieces, and 
other minutiae of publication. 

2. THE SELECTION PROCESS^18 

The Selection Process refers to the means whereby original documents 
are selected for preservation purposes. The choice of selection strategy 
may be intrinsically affected by the choice of preservation or media 
conversion technology used (see 3.1), since the latter may well affect 
costs and other parameters associated with the former. Thus, the total 
costs of preservation will be a complex combination of the effects of 
selection strategy and choice of technology. 

Thus, for example, with the use of microform (3.1.4), it is highly 
desirable (if not imperative) to obtain a complete copy of the document 
to be preserved prior to recording. This may require replacing missing 
or damaged pages from the prime copy being microfilmed, and the 
expense of obtaining these pages from copies held in other libraries. 
Microfilming also places a premium on recording only once. With the 
use of digital technologies (3.1.5), on the other hand, such replacement 
pages could be scanned at a later date and electronically "edited" into 
the main electronic document: with digital technologies, it may in fact 
be cheaper to scan more than one copy to facilitate such "editing" 
rather than to expend excessive manual labor on assembling the most 
perfect paper copy possible prior to microfilming. 

The following is a brief (and very over-simplified) classification of 
selection methodologies. It is only intended to sketch the range of 
possibilities and not to do full justice to the complexity of this subject. It 
merely indicates some of the main lines of strategy or process used in 
selecting documents for preservation. Furthermore, often a 
combination of approaches is used rather than any single approach, 
with the actual condition of the document being the dominant factor 
in the choice. 



18 See also "Selection for Preservation of Research Library Materials," a Report of the 
Commission on Preservation and Access, August 1989. 

In all cases, the "universe" of documents to which the selection 
strategies outlined in this Section are applied is those documents that 
are deteriorating or are likely to deteriorate, such as brittle books or, 
more generally, books printed on acidic paper. "Preservation", 
however, may also be applied to the conversion onto other media of 
materials that, while in quite good condition, are scarce or unique, thus 
allowing patrons to handle facsimiles instead of the precious originals. 

The term "essentially all documents" is used below to define 
documents from within the former universe that fit within the 
indicated selection strategy, while allowing that a number of these 
selected documents may yet be rejected following review for various 
reasons (such as having deteriorated to the point that preservation is 
not possible, or because it has been determined that the document has 
already been preserved elsewhere). 

2.1. By Title 

Selection is made from among individual works, perhaps by 
professional bibliographers who, possibly working in 
consultation with others, make a determination of the value of 
the selected work to a given collection, discipline, or field of 

2.2. By Category 

Selection is made by choosing essentially all documents from 
within a given category, such as within a given time period, or 
of a given format (for example, all newspapers), subject 
classification, special collection, or, say, American imprint. The 
essence of this approach is that all documents within the 
category be readily and conveniently definable and accessible, 
without having to resort to time-consuming selection processes. 

Colloquially, this approach is sometimes erroneously termed the 
"vacuum cleaner approach", an appellation that is overly 
pejorative insofar as some prior review is almost always made to 
reject materials within a category that for various reasons are not 
suitable or desirable for preservation. In particular, a check is 
made to ensure that the material has not already been preserved. 

Selection, for example, by time period permits the focus of effort 
on those periods of highest risk of deterioration with respect to 
paper-manufacturing processes. 

2.3. By Bibliography 



Selection is made by choosing essentially all documents specified 
in a published bibliography. 



Selection is made by choosing essentially all documents in poor 
condition that are actually used by patrons as judged by some 
criterion such as, for example, frequency of circulation. 



Selection is made by preserving the documents in the worst 
physical condition. 

The foregoing are examples of selection according to certain established 
criteria. Selection may also be made according to established 
procedures: 

2.6. By Scholarly Advisory Committee 



Selection is made with the assistance of a committee of scholars 
knowledgeable in a particular field who choose the material they 
consider to be of most importance to that field. 



Selection is made from institutional collections determined in a 
program initiated by the Research Libraries Group (RLG)^19 and 
described in the RLG Conspectus. The Conspectus describes 
collections on various levels from Level 0 (Out-of-Scope, a level 
which is in fact non-existent), through Level 4 (Research), to 
Level 5 (Comprehensive). Collection development officers 
(selectors) in about 50 major research libraries in the U.S. have 
evaluated their own collections to provide such brief 
descriptions. The Conspectus can be used as one of several 
means to determine "Great Collections." 



19 The Research Libraries Group, Inc., is a not-for-profit corporation owned and operated by 
its governing members: major universities and research institutions in the United States. 

3. THE PRESERVED COPY 

This section addresses technologies employed in the preservation 
process. The first section broadly classifies different kinds of 
preservation processes. The remaining sections focus on the different 
technological stages associated with preservation processes dependent 
upon media conversion technologies. These are: capture technologies, 
storage technologies, access technologies, distribution technologies, and 
presentation technologies. 

The divisions among these various stages of technology may, at first, 
seem artificial, particularly to those used to working with paper. For 
example, we distinguish between the storage medium (3.3.1), the 
distribution medium (3.5.1), and the presentation medium (3.6.1). In 
the world of paper, as stated in the Introduction, these are usually all 
one and the same, even though the same paper book, say, may play 
different roles at different times. When it is on the library bookshelf, it 
is a storage medium; when it is being messengered through inter-library 
loan, it is the distribution medium; and when it is being read by 
the patron, it is the presentation medium. In the world of convertible 
technologies, the separation becomes more than convenient sophistry: 
it becomes essential, since different media may well be used at any 
stage of the process. Consider, for example, a table from a scientific 
journal article (paper: the storage medium), which is FAXed across the 
nation using a data network (digital electronic: the distribution 
medium), and printed out directly onto photographic slides (film: the 
presentation medium) for projection in a lecture. 

Indeed, in the preservation milieu, this conceptual separation also 
offers considerable flexibility. It offers the flexibility of separating the act 
of preservation itself from the ultimate means of storage and delivery. 
Thus, for example, microfilming may be used as a preservation process 
(3.1.4), but the microfilm contents may be printed later onto paper for 
user presentation purposes. Or the microfilm may be digitally scanned 
and the contents stored on computer files for subsequent distribution 
across networks. As another example of this flexibility, images scanned 
and stored using digital preservation techniques (3.1.5) may later be 
interpreted using internal character recognition (3.2.5) or page 
recognition (3.2.6) technologies. 

The point is that the ultimate use of the preserved document may not 
be well-articulated at the time of preservation. Thus, preservation 
technologies that offer the greatest flexibility are to be preferred to those 
(such as photocopying (3.1.3)) that offer less flexibility, although lack of 
funds and patron preference often dictate the use of the latter. 

The distinction between the various technology stages is maintained 
throughout this Glossary. 



3.1. Preservation and Media Conversion Technologies 

Many different technologies have been proposed to address the 
problems of preservation. These can be divided into three broad 
categories: those directed at preserving both the content and 
physical embodiment of the original, those directed at 
preserving the content and copying the physical embodiment, 
and those directed at preserving the content only, without 
concern for the physical embodiment. Conservation and paper 
deacidification fall into the first category. The remaining 
technologies described below fall into the other categories. 

In the second category every effort is made to copy the physical 
embodiment or format of the original as faithfully as possible, 
normally onto another medium. The term media conversion 
technologies is thus used for this class (note: this does not 
exclude copying a paper document onto another paper 
document: media conversion has still occurred). Media 
Conversion includes photocopying (3.1.3), microform recording 
(3.1.4), and the use of electronic digitization techniques (3.1.5). 

The third category makes no attempt to preserve or copy the 
physical embodiment of the original. For example, merely 
rekeying the text (see 3.2.8) of a document composed entirely of 
text preserves only content and nothing else if no attempt is 
made to capture font and other formatting information. 

Among librarians, the term "reformatting" has traditionally 
been used for "media conversion." The former term is not used 
in this Glossary because of possible confusion with the concept of 
Document Format (1.2). Furthermore, "reformatting" does not 
do justice to the concept of copying onto microform (3.1.4) or of 
digital scanning (3.1.5).^20 

This necessarily brief glossary of different preservation 
approaches also summarizes some of the key issues involved in 
comparing the various alternatives. 

3.1.1. Conservation Treatment^21 

The treatment of a document to preserve it in its original form, 
in recognition that the original medium, format, and content are 
all important for research and other purposes. Pure conservation 
approaches are normally hand-tailored to the individual 
document and, as such, may be relatively expensive. Use is 
normally, therefore, limited to those situations where such 
expensive treatment is justified by the research requirements. 

3.1.2. Paper Deacidification and Strengthening^22 

The treatment by chemicals to stabilize a document (in paper, by 
alkalization to neutralize the acid content) and/or to strengthen 
it (in paper, by the use of a support coating or by impregnation). 
The alkalization treatment also usually entails depositing an 
alkaline reserve to buffer against further acidification. 

Deacidification or strengthening can be applied to individual 
documents or, with some treatment processes, to a large number 
of documents at once (mass or bulk deacidification). The latter is 
a relatively cheap approach, and pilot plants have been or are 
being established in a number of countries to support different 
processes. There is, however, no standard approach at this time 
even though there appear to be a number of promising 
alternatives. There are also a number of unanswered questions 
at this time regarding the longevity of chemical stabilization 
processes, toxicity, the feasibility of scaling processes to full 
production requirements, the potential continuing "offgassing" 
implications to patrons resulting from the storage of thousands 
of treated volumes in confined library spaces, and other issues. 
Recent research appears to be addressing many of these concerns. 



20 It is tempting to use the term "remediate" for "media conversion," a temptation that has 
been resisted in the formulation of this Glossary. 

21 For a discussion of the importance of conservation see "On the Preservation of Books and 
Documents in Original Form," by Barclay Ogden, Report of the Commission on Preservation 
and Access, October 1989. 

22 For more information see "Technical Considerations in Choosing Mass Deacidification 
Processes," by Peter G. Sparks, published by the Commission on Preservation and Access, 
May 1990. 

Deacidification is essentially a stabilization process that arrests 
deterioration. It does not turn brittle books back to their original 
state, although coating or impregnation can strengthen the paper 
to extend its useful life. Its greatest utility may lie in arresting 
embrittlement in books that are not too far gone, or for 
prophylactic protection of new or old books that have not yet 
started to turn brittle. Deacidification may also "buy time" in 
anticipation of later preservation by other processes. 

3.1.3. Photocopying 

Photocopying refers to the process of preserving the document 
by making a full-size (usually bound similarly to the original) 
facsimile copy on archival (1.5.1) paper by creating a 
photographic copy of the images of the pages contained in the 
document, possibly using a photocopier (3.2.1). As used here, 
photocopying refers to an in-line process where the original is 
scanned and one or more photocopies made all in one pass, with 
no form of retained intermediate storage being automatically 
generated (as contrasted with microform recording (3.1.4)) so that 
more copies can be made in the future. In actual practice, 
however, when photocopying is used for preservation it is 
customary to make a second photocopy that is retained in 
unbound form, so that further copies can readily be made in the 
future from this master copy. 

A distinction is made between straight photocopying, which 
does not necessarily involve the use of archival paper (1.5.1), and 
preservation photocopying, which does require the use of
archival paper. 

The advantages of making such a facsimile are that normally a
single paper facsimile is produced that is quite faithful to the
original, there is no machine interface required other than the
photocopier itself, the medium (1.1) and format (1.2) of the
original are retained, and the cost is usually less than other
processes, particularly if the original is a monochrome
document. Furthermore, library patrons prefer paper facsimiles
to the use of, say, microforms (3.1.4), except where bulky
documents, such as newspapers, are involved. The
disadvantages, as compared with microform recording (3.1.4)
and electronic digital preservation (3.1.5), are that normally
second copies made from the master copy are of poorer quality
than, say, prints of microforms made from master microforms.
Furthermore, the cost of making subsequent copies is higher
than the cost of printing microforms. Another disadvantage,
shared to a greater or lesser extent with microforms, is that
photocopying does not precisely reproduce all the information
in the original, and there is some loss of information, especially
for graphic objects (1.4.2.3) involving other than line art
(1.4.2.3.1).

3.1.4. Microform Recording 

Microform Recording refers to the process of preserving the 
document by filming the original document onto a microform 
film negative (1.1.2), that is, storing microimages of the pages or 
segments of the document on film. Positive film copies, which
can be produced inexpensively, are made from this original film 
negative or master. Such a positive copy is both a storage (3.3) 
and distribution (3.5) technology, and is normally viewed using 
a microform reader (3.6.2.2), or paper positive prints may be 
made from the positive microform using printing devices 
designed for the purpose. Access to microfilm (1.1.2) using such a 
reader is serial (cf 3.3.1.6), whereas access to microfiche (1.1.2) is 
random (cf 3.3.1.6) like a book. 

The advantages of microform are that the process is 
economically competitive with other processes; that film has a 
long useful life (3.3.5); and that microform copies — made from 
a second negative [23] (known as the printing master) copied from
the original negative — may be made cheaply and distributed 
among other institutions, so that access is not limited to a single 
facsimile. Microform preservation is a well-tried, tested, and 
accepted method of preservation. 



[23] The original, or preservation, negative should not be viewed with a microform reader
(3.6.2.2) because of potential damage to the negative.

The disadvantages are that there is usually a loss of information
in the recording process, particularly in recording continuous
tone imagery (1.4.2.3.4), since the film used is usually of high
contrast [24]; and that readers dislike using microform readers
compared with, say, reading books.

Microform-preserved documents can subsequently be converted
to other media besides paper. They can be scanned (3.2.3) and
converted to digitally-encoded documents (3.1.5) to take
advantage of the benefits of digital encoding for storage,
distribution, and access. However, any loss of information in the
original recording process will be perpetuated in the subsequent 
digital recording. 

3.1.5. Electronic Digitization 

Electronic Digitization refers to the capture of the document in 
electronic form through a process of scanning (see 3.2.3) and 
digitization. The scanned image is stored electronically, usually 
on magnetic (see 3.3.1.6.1 and 3.3.1.6.2) or optical (see 3.3.1.6.3 and 
3.3.1.6.4) storage media. The electronically stored image may be 
further transformed for reasons such as compression (see 3.3.2) 
or information interpretation (see 3.3.3); and subsequently 
selected through the use of access technologies (see 3.4), 
distributed through the use of distribution technologies (see 3.5),
or viewed through the use of presentation technologies (see 3.6).

When originally scanned, or as a result of subsequent
transformations, the document may in whole or in part be
stored in image (3.1.5.1), unformatted text (3.1.5.2.1), formatted
text (3.1.5.2.2), or compound (3.1.5.3) form. The distinction is
important insofar as it affects inter alia the extent to which
information such as text in the scanned document may be
interpreted (3.2.5, 3.2.6, 3.2.7) and used for purposes of
information access (3.4, in particular 3.4.2, but see also 3.1.5.1,
3.1.5.2, 3.2.4). An image representation is an electronic pictorial
representation composed of dots (black and white, greyscale, or
color) much like a halftone (1.4.2.3.2) printed photograph, and
no distinction is made between text and other information (such
as graphs, pictures, and so forth) contained in the document;
in other words, the letter "b" is not stored as a character per se,
but as a "digital picture" of the letter "b", and the series of
numbers stored to represent the picture would be quite distinct
among different typestyles used. Text representations, on the
other hand, represent text as text, with a specific code used to
denote the letter "b" independent of what typestyle is used.

[24] Newer processes becoming available appear to remove the obstacle of high-contrast
recording.

Image representations cannot be searched for words or phrases;
text representations can. Image representations of text may be
converted into formatted or unformatted text representations
using OCR (3.2.4) or ICR (3.2.5) techniques, but with loss of
accuracy. In the context of preservation, image representations
are likely to dominate, since the cost of transforming image into
text representations with sufficient accuracy may be prohibitively 
high, at least in the immediate future. Thus full-text searching, 
for example, is not likely to be a feature of digitally-preserved 
documents. This is unlike the situation that exists with 
documents where the text already exists in digital electronic 
form, such as if the publisher had preserved the original tapes 
used in typesetting. 

If and when OCR techniques are able to convert image format to 
text format with sufficient accuracy and performance, then the 
archives of digitally-preserved material in image format can be 
converted to text format using ICR (3.2.5) techniques, provided 
the original material was scanned with sufficiently high 
resolution (3.2.3). Furthermore, promising research has been 
done recently on the searching of documents for retrieval 
purposes using the "corrupted" (erroneous) text derived from 
the OCR or ICR scanning of image documents at existing levels 
of OCR/ICR accuracy and performance.

The advantage of electronic digitization is that it potentially 
combines the advantages of photocopying and microform 
recording while eliminating some of the disadvantages. Paper 
facsimiles can be produced at will by printing-on-demand (3.5.4) 
on paper (or writing the appropriate signals on whatever might
be the appropriate output medium, in the case of video, film, or
sound), thus eliminating the need for awkward microform
readers. Alternatively, the stored images can be reconstructed
and viewed at computer workstations (3.6.2.6). Furthermore, the 
stored digital images can be distributed essentially at will across 
data networks (3.5.5) for sharing among institutions. The content 
of the stored images can also be interpreted at any time (3.2.5, 
3.2.6, 3.2.7) after recording (whenever it might become 
economically desirable to do so) for purposes of, say, creating
indices for access purposes (3.4.1).

Another key advantage is the robustness of digital encoding.
Further copies, including copies made in new formats (3.3.3) on
other digital electronic storage media (3.3.1.6) for purposes of
extending the useful life of the digital copy (see Introduction and
3.3.5), can be made without loss of information, as contrasted
with photocopying (3.1.3) or microform recording (3.1.4).
Furthermore, scanned images can be digitally enhanced (3.2.9) to
improve the image quality.

The disadvantages are that this is a new and relatively untried
technology, and the cost and other trade-offs are uncertain at this
time. There are also concerns about the useful life (3.3.5) of
present storage media, both in terms of the physical properties of 
the media and in terms of the robustness of the recording format 
(3.3.3) and of the means of access. Some, however, take the view 
that it will be both functionally and economically imperative in 
any event to recopy the data from storage medium to storage 
medium every few years to take advantage of the rapidly 
declining storage costs and increasing storage capacities of the 
technology, and that the useful life of a given medium is not the 
relevant issue (see Introduction and 3.3.5). 



3.1.5.1 Image Document

A representation of the document image is electronically captured
(usually with the aid of a digital image scanner; see 3.2.3) or created
without interpretation of its actual content. This is stored as a sequence
of 1's or 0's (known as bits), a "digital photograph" as it were. In
certain image representations, a "1" indicates "black" and a "0"
indicates "white" (Binary Encoding), but usually the representation is
encoded in more complex representations (see 3.3.4 Encoding Method). In
some representations, for example, the average grey level of a small
area of the page, termed a "pixel", is encoded (Greyscale Encoding. See
also 1.4.1.1.2). Such a pixel is a grey dot. The number of dots per inch is
termed the pixel resolution. This pixel resolution may range from 100
per inch to several thousand per inch.

It is not unusual, for reasons of storage economy, to convert a greyscale- 
encoded image document into a binary-encoded image document of 
higher resolution at the time an image document is stored. Compression 
techniques (3.3.2) are used to achieve this. The resultant stored image
represents a compromise between scanning resolution, image fidelity, 
and storage space. 

The electronically-encoded sequence of 1's and 0's that represents an
Image Document is also known as a Bitmap.
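
As a rough illustration of binary encoding, the sketch below thresholds a
small greyscale patch into 1's and 0's and packs the result into bytes. The
patch values, the threshold of 128, and the function names are assumptions
made for this example only; they do not describe any particular scanner or
storage format.

```python
def to_bitmap(grey_rows, threshold=128):
    """Binary-encode a greyscale image: 1 = black, 0 = white.

    grey_rows holds grey levels from 0 (black) to 255 (white).
    """
    return [[1 if level < threshold else 0 for level in row] for row in grey_rows]


def pack_bits(bit_rows):
    """Pack each row of bits into bytes, padding the last byte of a row with 0s."""
    packed = bytearray()
    for row in bit_rows:
        padded = row + [0] * (-len(row) % 8)
        for i in range(0, len(padded), 8):
            byte = 0
            for bit in padded[i:i + 8]:
                byte = (byte << 1) | bit
            packed.append(byte)
    return bytes(packed)


# Hypothetical 4-row, 5-pixel-wide greyscale patch (a crude vertical stroke).
patch = [
    [250, 240, 30, 245, 255],
    [255, 200, 20, 230, 250],
    [245, 130, 10, 127, 255],
    [250, 240, 25, 240, 250],
]
bitmap = to_bitmap(patch)
packed = pack_bits(bitmap)
for row in bitmap:
    print("".join(str(bit) for bit in row))
```

Each printed row is one scanline of the bitmap; the packed form is what
would actually be written to the storage medium before any compression.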

Image Documents are generally accessed by associating an index entry,
such as a page number, with a segment of the Image Document. See
discussion following under 3.1.5.2 regarding other issues associated
with searching and retrieving Image Documents.

3.1.5.2 Text Document 

The text of the document only is captured as character representations,
that is, each alphabetic character has a unique representation (see
discussion above) following a standard means of encoding, such as the
ASCII standard. With electronic digital storage, storing a
representation of a character generally takes far less space
than representing the same character in image form.
Usually, each character representation of a letter of, say, the Roman
alphabet takes 8 bits (1 byte) of storage space. When stored in image
form, the representation may take several orders of magnitude more
storage space, depending upon the size of the character, the scanning
resolution, and the degree of compression (see 3.3.2) used. See also
3.3.4.2.
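
The comparison above can be made concrete with a little arithmetic. The
sketch below computes the uncompressed bits needed to hold one character
cell in image form and compares this with the 8 bits of a text code. The
cell size of 0.1 by 0.15 inch and the resolutions tried are assumptions
chosen for illustration; the true ratio depends on type size, bits per
pixel, and compression.

```python
TEXT_BITS_PER_CHARACTER = 8  # one byte per character, as stated above


def image_bits_per_character(cell_width_in, cell_height_in, pixels_per_inch, bits_per_pixel=1):
    """Uncompressed bits needed to hold one character cell in image form."""
    columns = round(cell_width_in * pixels_per_inch)
    rows = round(cell_height_in * pixels_per_inch)
    return columns * rows * bits_per_pixel


# Hypothetical character cell of 0.1 x 0.15 inch, binary-encoded.
for ppi in (100, 300, 600):
    bits = image_bits_per_character(0.1, 0.15, ppi)
    ratio = bits / TEXT_BITS_PER_CHARACTER
    print(f"{ppi} pixels/inch: {bits} bits as image, {ratio:g} times the text form")
```

Passing bits_per_pixel=8 models greyscale encoding and multiplies each
figure by eight.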

Storing a document as a text document facilitates full-text or partial- 
text retrieval (see 3.4.2), where documents or parts of documents can be
selected and retrieved by searching for the occurrence of keywords or
strings of text. This is not possible with Image Documents (3.1.5.1),
unless they have been wholly or partially converted to Text Documents
using Optical Character Recognition (OCR) techniques (3.2.4, 3.2.5), a
process that is not sufficiently accurate for most preservation purposes 
(see, however, 3.2.4 for a discussion of the use of such techniques for the 
construction of indices). 

3.1.5.2.1 Unformatted Text 

The character representation of the text contains no information 
to indicate font style, font size, or page layout. In this sense, 
unformatted character text representations are an example of 
irreversible compression (see 3.3.2.3). 

3.1.5.2.2 Formatted Text 

The character representation of the text also contains sufficient 
information to describe one or more of font type, font size, or 
page layout. In this sense, formatted text may, if the document 
segment contains only textual material, represent a form of 
reversible compression (see 3.3.2.2). 

3.1.5.3 Compound Document 

The document is captured as a combination of image and formatted or 
unformatted text. 

3.1.6. Rekeying of Text

Rekeying of Text refers to a preservation technology where the
text in a document is literally reentered by hand into a
composition or other device for republication or reproduction
purposes, often with the use of a digital computer. See also 3.2.8.

3.1.6.1 Unformatted Text

In the rekeying of the text, no attempt is made to key sufficient
information to indicate font style, font size, or page layout.

3.1.6.2 Formatted Text

In the rekeying of text, information is captured to indicate one or more
of font style, font size, or page layout. 

3.1.7. Reprinting or Republication

The document is preserved by producing a new edition or 
reprint, possibly by reprinting from retained intermediate forms 
of the document, such as reprinting a book from 
photocomposition tapes. Alternatively, the document may be 
recreated from scratch.

3.2. Capture Technology

Capture Technology refers to the technology used to transform
the images or information contained in the original document
into some other form, the form dependent upon the overall
media conversion technology being used. This term is not
relevant to Conservation (3.1.1) or Deacidification (3.1.2), which
are conservation technologies, and do not employ media
conversion techniques. Printing (see 1.1.1) on paper is, of course,
also a capture technology.

3.2.1. Photocopier



A Photocopier is a device for making photographic copies of
graphic images. A common form of the photocopier involves
the use of the xerographic process, where light reflected from the
original document is focused onto an electrically charged
insulated photoconductor, and the latent image is developed
using a resinous powder. For the purposes of this Glossary, the
term photocopier is restricted to devices that use analog
technologies, such as the use of light lens technology. Digital
technologies are incorporated separately (see 3.2.3). With
photocopiers so defined, the image is normally scanned and
printed essentially in a single operation, and an intermediate
scanned latent image is not normally stored for re-use at a later
stage, although the two-stage processes of photography, which
indeed may be used for photocopying, do permit the use of the
photographic negative as an intermediate storage device (a
particular case of which is the use of microform recording
technology; see 3.2.2).

3.2.2. Microform Recorder



A Microform Recorder is a camera or other photographic device 
for photographing the original document and printing it onto 
one of several forms of microform (1.1.2). The microform film 
in essence becomes both a storage medium (see 3.3.1.2) and a 
presentation medium (see 3.6.1.2 and 3.6.2.1). Other film copies 
and paper copies may also be made from the microform 
negatives for presentation (see 3.6.1.2). 

3.2.3. Digital Image Scanner



A Digital Image Scanner is a device for scanning the images 
contained on pages of a document and transforming the scanned 
image into digital electronic signals corresponding to the 
physical state at each part of the search area, that is, into image 
documents (3.1.5.1). These signals are most often stored (see 3.3)
for subsequent interpretation (see 3.2.5, 3.2.6, 3.2.7, and 3.3.2,
3.3.4), access (3.4), distribution (3.5), or presentation (3.6). A single
small element of the document (known as a "pixel") is thus
encoded quantitatively by a digital number, where the number 
contains sufficient information to represent the image content of 
the pixel (see 3.1.5.1). A digital image scanner on its own does 
not interpret the image information. The number of pixels per
linear inch is considered to be the resolution of the scanner.
Typical resolutions with current technology range from 100
pixels per linear inch to over 1,000 pixels per linear inch, but
there are trade-offs between resolution, speed, cost, and quality.
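
One side of the resolution trade-off is storage. The following sketch
estimates the uncompressed storage needed for a single scanned page at
several resolutions. The 8.5 by 11 inch page size and the function name
are assumptions for the sake of the example, not figures from this
Glossary.

```python
def page_storage_bytes(width_in, height_in, pixels_per_inch, bits_per_pixel=1):
    """Uncompressed storage, in bytes, for one scanned page."""
    columns = round(width_in * pixels_per_inch)
    rows = round(height_in * pixels_per_inch)
    total_bits = columns * rows * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical 8.5 x 11 inch page; the page size is an assumption.
for ppi in (100, 300, 1000):
    binary = page_storage_bytes(8.5, 11, ppi)
    grey = page_storage_bytes(8.5, 11, ppi, bits_per_pixel=8)
    print(f"{ppi:>5} pixels/inch: {binary:>12,} bytes binary, {grey:>12,} bytes greyscale")
```

Tripling the linear resolution from 100 to 300 pixels per inch multiplies
the uncompressed storage by nine, which is why compression (3.3.2) matters
so much in practice.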



Digital Image Scanners may scan in one or more different 
modes, depending upon their capability and depending upon 
whether they are scanning monotone or color (1.4.1), or whether 
they are scanning line art, greyscale, halftone, or continuous 
tone objects (1.4.1.3, 3.1.5.1). Performance, in terms of speed,
accuracy, and resolution, depends upon the degree to which these
attributes can be accommodated. The speed of digital image
scanners ranges from one or two pages per minute to around fifty
per minute. 

A FAX machine (3.5.3) is a special form of digital image scanner. 
Other special forms of digital image scanners exist for scanning 
from media other than paper, such as digital image scanners that 
scan directly from microfilm (1.1.2). Such images scanned from 
microfilm, however, can be no better than the original 
microfilm image itself (see 3.1.4). 

Digital image scanners may come equipped with different 
physical devices for accommodating the original documents. 
These may include flatbed platens equipped with manual feeds, 
semi-automatic feeds (one page at a time is fed into an automatic
hopper), or fully-automatic feeds. Manual feeds offer the greatest 
safety from potential jamming, a point of importance in the 
scanning of unique documents. Flatbed scanners generally 
require either books to be disbound and one page at a time placed 
on the platen, or require books to be laid open face-down on the 
platen, which may cause some distortion. They may also come 
equipped with edge-scanners, which scan right up to the binding 
of the book, avoiding this distortion; or with cradle scanners, 
where the book is opened in a cradle (such devices are also used 
in some microform recording devices) and two angled scanning 
heads are lowered into the open, cradled book. In all cases, 
quality control of scanning is an issue with respect to fidelity of 
the scanned image and registration of the scanned image with 
respect to a defined standard. 

3.2.4. Optical Character Recognition (OCR) Scanner



An Optical Character Recognition (OCR) Scanner is a digital
image scanner that in addition interprets the textual portion of
the images and converts it to digital codes representing
formatted or unformatted text (3.1.5.2). The less sophisticated
such devices can only "recognize" one or a few fonts of a fixed
size, and can only interpret such information as unformatted
text. The more sophisticated devices can represent multiple fonts
of different sizes, and can interpret limited information as
formatted text. At either extreme, no device achieves 100%
recognition accuracy: accuracy of the better devices typically
ranges between 95% and 98%, depending upon manufacturer-
imposed trade-offs between the sophistication of the device, its
speed, and its intended range of applicability.
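
To give a feel for what 95% to 98% accuracy means in practice, the sketch
below estimates the number of misrecognized characters on a page and the
chance that a whole word comes through intact. The figure of 2,000
characters per page and the assumption that errors occur independently
from character to character are simplifications introduced here, not
claims made by this Glossary.

```python
def expected_errors(characters, accuracy):
    """Expected number of misrecognized characters at a given per-character accuracy."""
    return characters * (1.0 - accuracy)


def word_correct_probability(word_length, accuracy):
    """Chance that every character of a word is right, assuming independent errors."""
    return accuracy ** word_length


CHARS_PER_PAGE = 2000  # assumed figure for a printed page; not from the Glossary

for accuracy in (0.95, 0.98):
    errors = expected_errors(CHARS_PER_PAGE, accuracy)
    p_word = word_correct_probability(6, accuracy)
    print(f"{accuracy:.0%} accuracy: about {errors:.0f} errors per page, "
          f"six-letter word correct with probability {p_word:.2f}")
```

Under these assumptions even the better devices leave dozens of errors on
every page, which is consistent with the view that OCR output needs
proofreading or error-tolerant use.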

OCR devices are most often used where scanning errors and
unformatted text are acceptable limitations, such as, for example,
where the input material can be subsequently proofread and
corrected, or where redundant information is scanned and the
redundant information used to correct any inconsistencies
arising from scanning errors (typically in certain commercial
applications). In the context of document preservation, most
uses of OCR devices are limited to where text information only
suffices, and the form of the original document is not an
important aspect of preservation. An important application is
for use in the construction of indices for access and distribution
(see 3.4 and 3.5), or for full contextual searching of information
(3.4.2). Promising research has been done, for example, on the
searching and retrieval of documents using the "corrupted"
(erroneous) text derived from the OCR scanning of documents.
The techniques utilized in this approach exploit the redundant
information contained in the corrupted text.
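
The research referred to above is not described in detail here, so the
following is only a stand-in to show the general idea: approximate string
matching can still find a word whose OCR rendering contains errors. It
uses the SequenceMatcher class from Python's standard difflib module; the
sample text, the similarity threshold of 0.8, and the function name are
all assumptions made for this illustration.

```python
from difflib import SequenceMatcher


def fuzzy_find(query, text, min_ratio=0.8):
    """Return the words in text whose similarity to query is at least min_ratio."""
    hits = []
    for word in text.lower().split():
        word = word.strip('.,;:()"')
        if SequenceMatcher(None, query.lower(), word).ratio() >= min_ratio:
            hits.append(word)
    return hits


# Hypothetical OCR output in which "m" has twice been misread as "rn".
ocr_text = ("Deacidification arrests embrittlernent of paper; "
            "ernbrittlement is not reversed.")

print("embrittlement" in ocr_text)             # exact search fails
print(fuzzy_find("embrittlement", ocr_text))   # approximate search succeeds
```

An exact search misses both occurrences, while the approximate search
recovers them because most of each word survives recognition intact.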

Handwriting recognition devices, an extreme form of OCR 
devices, are not included in this Glossary. At this time, such 
devices are limited in capability. 

3.2.5. Internal Character Recognition 

Internal Character Recognition is the term sometimes used
when the same interpretation technology that is used in OCR
devices (3.2.4) is applied to an already stored digital image at a
later date. This separates the functions of scanning the images
(3.2.3) digitally, and of interpreting the images. Interpreting the
scanned and stored images at a later date also allows for using
different recognition technologies in the tradeoffs between
accuracy, speed, and function. In the context of preservation and
media conversion, it also allows for the immediate focus to be
placed on scanning and storage (and possibly media conversion),
deferring the option of character recognition and its applications
(see 3.2.4) to a later date; at such time, massive-volume
character recognition and information interpretation is likely to
be more economically feasible at higher levels of accuracy than
with present technology.

3.2.6. Intelligent Character Recognition

Intelligent Character Recognition is the term sometimes given
to Optical or Internal Character Recognition where the scanned 
and recognized information is further interpreted to take 
advantage of contextual information, that is, words, phrases, and 
so forth, rather than simply treating the text as a string of 
independent characters. Intelligent Character Recognition, for 
example, may be used by sophisticated computer programs to 
construct concordances automatically, or to create highly- 
sophisticated indexes. At this stage, intelligent character 
recognition is a field of research, rather than production, 
interest. 

3.2.7. Page Recognition

Page Recognition is the term given to the automatic 
interpretation of features contained within the printed page such 
as titles, subheads, columns, paragraphs, figures, figure captions, 
footnotes, and so forth. Additional capabilities of sophisticated 
page recognition algorithms include the ability to determine 
fonts and font sizes. In essence, Page Recognition "reverse
engineers" the image into marked-up copy. 

3.2.8. Rekeying of Text

As an alternative or complement to OCR (3.2.4), textual 
information can be encoded by directly keying alpha-numeric 
text into computer files manually. This has some advantage in 
accuracy over OCR, but is slower. It may also be used in
situations where the brittleness of acidic documents makes them
so fragile that scanning technologies cannot safely be used. See
also 3.1.6. 

3.2.9. Enhancement 

Enhancement refers to the use of mathematical algorithms to
improve the quality of digitally scanned images (3.2.3), such as by
computationally adjusting the contrast or brightness of the
scanned image. The term also includes techniques that may be
used to modify the scanned image for structural reasons, such as
bordering to remove any unwanted scanned areas surrounding
the actual document pages, de-skewing to rectify the scanned
image to correct for any skew in the placement of the document
on the scanner, or margin adjustment to ensure that pages are
properly aligned with each other.
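
Two of the enhancement operations named above, contrast adjustment and
bordering, can be sketched in a few lines. The representation of an image
as a list of rows of grey levels (0 for black, 255 for white), the linear
stretch, and the function names are assumptions made for this example;
production systems use more elaborate methods.

```python
def stretch_contrast(rows):
    """Linearly rescale grey levels so the darkest becomes 0 and the lightest 255."""
    lo = min(min(row) for row in rows)
    hi = max(max(row) for row in rows)
    if hi == lo:
        return [row[:] for row in rows]  # flat image: nothing to stretch
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in rows]


def crop_border(rows, white=255):
    """Remove the all-white rows and columns surrounding the content ("bordering")."""
    keep_rows = [i for i, row in enumerate(rows) if any(v != white for v in row)]
    if not keep_rows:
        return []
    keep_cols = [j for j in range(len(rows[0])) if any(row[j] != white for row in rows)]
    top, bottom = keep_rows[0], keep_rows[-1]
    left, right = keep_cols[0], keep_cols[-1]
    return [row[left:right + 1] for row in rows[top:bottom + 1]]


# Hypothetical faded scan: the "ink" is only moderately darker than the paper.
faded = [
    [200, 200, 200, 200],
    [200, 100, 150, 200],
    [200, 200, 200, 200],
]
enhanced = stretch_contrast(faded)
trimmed = crop_border(enhanced)
print(enhanced)
print(trimmed)
```

After stretching, the paper becomes pure white and the darkest ink pure
black, so the border can then be identified and removed.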

A full glossary of terms associated with enhancement is beyond 
the scope of this document.

3.3. Storage Technology

Storage Technology refers to the technology used to store the
images or information obtained through the use of some form
of Capture Technology (3.2). This includes the medium used for
storage (3.3.1), the compression methodology used to minimize
the amount of storage medium employed (3.3.2), the format
used to program the image or information onto the medium
(3.3.3), the encoding methods used to represent any
interpretation of the stored information (3.3.4), and the useful
life of the storage medium (3.3.5).

3.3.1. Storage Medium

3.3.1.1 Paper (see 1.1.1) 

3.3.1.2 Microform (see 1.1.2) 

3.3.1.3 Video (see 1.1.3) 

3.3.1.4 Film (see 1.1.4) 

3.3.1.5 Audio (see 1.1.5) 

3.3.1.6 Digital Electronic 

A family of storage devices where information or data are represented
by a series of quantized changes to the surface of the storage medium,
where such quanta are recorded or modified using electronic means.
There are two main classes in this category: magnetic devices where, in
recording, the magnetic state of a coated surface is altered by the
electronic digital signal, and, in reading, the surface is sensed using
reading heads conceptually similar to those used in common tape
recorders; and optical devices where the optical properties of a coated
surface are altered (in one such technology, submicrometer-sized holes
are recorded and read by laser beams focused by electronic means onto
the area of the spot). The recorded quanta normally correspond to a
recorded "1" or a recorded "0", that is, to bits (derived from "binary
digits"), all data and information being constructed from these basic
building blocks.

Such devices are further classified according to whether they are 
read/write devices (that is, information may be written onto the device
and read from the device, and the information can be modified as many
times as desired), read only memory (ROM) devices (that is,
prerecorded information can be read from the device, but the
information cannot be modified), or write-once-read-many (WORM)
devices (that is, information may be written once by the consumer onto
the device, but thereafter it can only be read). Most optical devices are
either read only or WORM devices, but a class of devices that combine
both magnetic and optical technologies (magneto-optical devices) are
indeed read/write devices. 

Typically, magnetic devices are of higher performance in terms of
access time to a given segment of recorded information and transfer time
of such accessed information to the host device. Optical devices,
however, are generally more economic in terms of storage capacity.
Magnetic technologies have a longer history than optical technologies,
and more is known about their useful life, for example (see 3.3.5). Both
technologies seem to be following similar cost/performance curves with
performance parameters doubling in capability approximately every
two to three years (except for access times which are improving much
more slowly), and cost per bit halving about every two to three years.
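
The halving rule quoted above can be turned into a simple projection. The
sketch below assumes, purely for illustration, that the trend continues
smoothly; the ten-year horizon and the function name are choices made for
this example.

```python
def projected_cost(cost_now, years, halving_period_years):
    """Cost per bit after `years`, if it halves every `halving_period_years`."""
    return cost_now * 0.5 ** (years / halving_period_years)


# Relative cost per bit after ten years, taking today's cost as 1.0.
for period in (2, 3):
    factor = projected_cost(1.0, 10, period)
    print(f"halving every {period} years: {factor:.3f} of today's cost after 10 years")
```

On either assumption the cost per bit falls to a small fraction of its
present value within a decade, which is the reasoning behind the view
that periodic recopying onto newer media may be economically attractive.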

Both devices are further classified as to whether they are random
access devices (such as disk storage devices) or serial access devices
(such as tape storage devices). With random access devices,
information stored at any point can be directly accessed (much as is
accomplished by placing the playing-arm of a phonograph at any point
on the phonograph record); with serial access devices, information can
only be accessed by passing through information that may be recorded
ahead of it on the medium (as in winding through a tape on a tape
recorder to arrive at a particular passage).
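
The difference between the two kinds of access can be modelled very
simply by counting how many records must be passed over before the wanted
one is reached. This is a toy model with invented names, not a
description of how any real disk or tape drive operates.

```python
def serial_read(records, index):
    """Tape-like access: pass over every earlier record to reach the target."""
    passed = 0
    for position, record in enumerate(records):
        if position == index:
            return record, passed
        passed += 1
    raise IndexError(index)


def random_read(records, index):
    """Disk-like access: go directly to the target record."""
    return records[index], 0


# Hypothetical 1,000-page document stored one page per record.
pages = [f"page {n}" for n in range(1, 1001)]

print(serial_read(pages, 499))   # 499 records passed over first
print(random_read(pages, 499))   # no records passed over
```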



3.3.1.6.1 Magnetic Disk

A rotating circular plate having a magnetized surface on which
information may be stored as a pattern of polarized spots on
concentric or spiral recording tracks. These plates or platters
are usually stacked in disk drives, several to a drive. These
platters may either be removable or not, although in high
performance disk drives, the platters are usually not
removable. They are, however, read/write devices (3.3.1.6).
Some removable magnetic disks of lower capacity are known as
floppy disks, since originally the recording medium was made
of a flexible plastic.

3.3.1.6.2 Magnetic Tape

A plastic, paper, or metal tape that is coated or impregnated
with magnetizable iron oxide particles on which information is
stored as a pattern of polarized spots. These are read using
magnetic tape drives. Access times with magnetic tapes are
slower than those associated with correspondingly priced
disks, since they are serial access devices, but the tapes are
almost always removable so that the information can be stored
off-line, thus making tapes [25] useful for archival storage (but
see 3.3.5).

3.3.1.6.3 Optical Disk

A rotating circular plate on which information is stored as
submicrometer-sized holes and is recorded and read by laser
beams focused on the disk. This includes the class of CD-ROM
devices, which embodies the same 5 1/4" diameter format used
for CD recordings. CD-ROMs are usually read by inserting the
CD-ROM disk into a CD-ROM player. Other typical formats
involve 12" or 14" diameter formats, but there is a dearth of
standards. The latter are usually read by inserting them into
optical jukebox devices, which perform the role suggested by
their name. Even when mounted, access times for optical disks
are typically relatively slow, because of the lag time needed to
"spin up" the disk. However, the cost per stored bit is
extremely low. Error rates may also be higher than for
magnetic technologies. As such, optical disks are most useful
where there is an abundance of redundant information
contained in the stored data, such as would be the case with the
storage of scanned document pages. On viewing the data, the
eye would not likely be troubled by a tiny dot among an ocean of
dots being the wrong shade of grey. See also the discussion of
magneto-optical devices (3.3.1.6.5). Conversely, magnetic
devices excel in the recording of encoded text (see 3.3.4.2), but
may be expensive to use for the storage of images even when
compressed (3.3.2).

3.3.1.6.4 Optical Tape

An emerging class of technology that combines the advantages
and disadvantages of tape (3.3.1.6.2) with those of optical
recording technology (3.3.1.6.3). Their chief advantage may lie
in very cheap cost per bit storage, but at this time they suffer
from relatively high error rates.

[25] Removable disks, such as floppy disks, are also used for archival storage. However,
magnetic tapes are usually cheaper when large volumes of data are to be archived.

3.3.1.6.5 Magneto-Optical Disk



Disks that combine the use of magnetic and optical
technologies. To record data, elements of the crystal structure of
the substrate are aligned by using a laser to heat the element in
the presence of an applied magnetic field. When the magnetic
field is aligned one way, a "1" is recorded; when the magnetic
field is reversed, a "0" is recorded. The data are read by
reflecting a lower-intensity laser beam off the surface; the
polarization of the reflected light varies according to the
crystal alignment of the element of the substrate. Unlike
regular optical disks, magneto-optical disks are read/write,
and have performance characteristics somewhere between
those of magnetic disks and optical disks in terms of access
times, transfer rates, and storage capacity.

3.3.2. Compression 

Compression refers to the extent to which the encoded form of 
the preserved or reformatted document has been modified to 
reduce the amount of storage space required by the storage 
medium. The technique takes advantage of the great redundancy 
that is present in much recorded data, particularly in image 
documents (3.1.5.1). Storage savings of a factor of ten or more 
may readily be achieved, depending upon the scanning 
resolution and methodology employed (3.2.3), the type of 
material being scanned, and the particular compression method 
used. Without compression, the storage requirements grow 
rapidly, as the square of the scanning resolution (3.2.3); with 
effective compression methods, the storage requirements can be 
constrained to grow almost linearly with the scanning 
resolution. This is because advantage is taken of the greater data 
redundancy accruing from the increase of scanning resolution: 
compression effectively eliminates or reduces this data 
redundancy. Thus, the greater the redundancy of information 
contained in the scanned material, the more compression is 
possible; continuous tone photographs, for example, often 
contain large amounts of redundant information. Compression 
is an important factor in the economics and efficacy of digital 
preservation. 
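
The quadratic growth described above can be checked with a little arithmetic. The sketch below is illustrative only: the page size (8.5 x 11 inches), the bitonal depth (1 bit per pixel), and the function names are assumptions made for this example, and `linear_model_bytes` is simply a hypothetical linear scaling rule, not a real compression method.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel=1):
    """Raw size of one scanned page: pixel count times bits per pixel."""
    pixels = int(width_in * dpi) * int(height_in * dpi)
    return pixels * bits_per_pixel // 8


def linear_model_bytes(base_bytes, base_dpi, dpi):
    """Hypothetical compressed size that scales linearly with resolution."""
    return base_bytes * dpi // base_dpi


# Assumed page: 8.5 x 11 inches, bitonal (1 bit per pixel).
for dpi in (150, 300, 600):
    raw = uncompressed_bytes(8.5, 11, dpi)
    print(f"{dpi} dpi: {raw} bytes uncompressed")
```

Doubling the resolution from 300 to 600 dots per inch quadruples the uncompressed size, whereas a size that grew linearly would only double.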

3.3.2.1 Uncompressed 

No compression has occurred. 

3.3.2.2 Reversibly Compressed 

Compression has occurred so that the process can, if required, be 
reversed so that the original can be recovered without loss of 
information. Also known as "lossless". 
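
As a minimal sketch of what reversibility means in practice, the example below uses plain run-length encoding on one row of bitonal pixels. Run-length encoding is chosen here only for brevity; it is not the CCITT scheme, and the function names and sample row are assumptions made for illustration.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse runs of identical values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, length in pairs:
        out.extend([value] * length)
    return out


# One scan line: mostly white (0) with a short black (1) mark.
row = [0] * 20 + [1] * 3 + [0] * 17
encoded = rle_encode(row)
restored = rle_decode(encoded)
```

Here 40 pixel values are reduced to three pairs, and decoding restores the row exactly, which is the defining property of reversible compression.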

3.3.2.2.1 CCITT Group Compression 

Compression standards defined by the International 
Telegraph and Telephone Consultative Committee 
(Comité Consultatif International Téléphonique et 
Télégraphique, or CCITT). 

3.3.2.2.2 Reversible Textual Compression 

If sufficiently complete, the representation in whole or in part 
of documents as formatted text (3.1.5.2.2) may represent a form 
of reversible compression. The use of a markup language 
(3.3.4.3) is also a form of reversible textual compression. See 
also 3.3.4. 

3.3.2.2.3 Page Description Language Compression (PDL) 

See 3.3.4.4. 

Other Compression Standards or Algorithms 

Refers to other compression standards, de facto standards, or 
algorithms. 

3.3.2.3 Irreversibly Compressed 

Compression has occurred so that the process cannot be precisely 
reversed. The original cannot be recovered without loss of information. 

3.3.2.3.1 Irreversible Textual Compression 

The representation in whole or in part of a document as 
unformatted or partially formatted text (3.1.5.2) may represent 
a form of irreversible compression. The content of the text may 
be obtained but not one or more of its font style, font size, or 
positioning on the page. 

3.3.3. Storage Format 

As used in information storage and retrieval, Format or Storage 
Format refers to the actual representation of the stored data on 
the storage medium, that is, the specific way in which it is 
encoded or programmed onto the medium. Classifying such 
methodologies is beyond the scope of this document. Indeed, for 
the most part, and particularly as applied to digital electronic 
storage technologies, there are few general standards that are 
accepted by all or most manufacturers. The implication is that 
access to the information stored on the medium depends upon 
specific software or computer programs supplied by the 
manufacturer, software that may become obsolete with the 
passage of time. One result may be that stored information may 
need to be reformatted or transferred to newer storage media 
periodically in order for the information to remain accessible 
with current software and technology. 

3.3.4. Encoding Method 

Encoding Method refers to the extent to which the information 
content of the document has been interpreted and encoded, 
rather than merely recorded. Such interpretation may be 
beneficial for a number of reasons including as a means of 
achieving reversible compression (3.3.2.2); for the construction 
of document indices to facilitate searching and access (3.4.1); or 
for efficient distribution of the information across data networks 
(3.5.5). For example, a document that has been merely scanned as 
a bit-mapped image (3.1.5.1) has not been encoded (3.3.4.1), even 
though faithful "digital pictures" of the pages of the document 
have been obtained. If the images of the document text are later 
interpreted through internal character recognition (3.2.5), then 
the digital representation has been textually encoded (3.3.4.2). 

3.3.4.1 No Encoding 

No interpretation of the information contained in the original document 
has occurred. If the document were originally scanned using a digital 
image scanner (3.2.3), then the document in this instance is generally 
stored in some image format (3.1.5.1), compressed or not (3.3.2). If 
portions of the document were originally scanned using optical 
character recognition (3.2.4), then those portions will be stored as 
either formatted or unformatted text (3.1.5.2). 

3.3.4.2 Textual Encoding 

The text contained in the original document has been interpreted so 
that each character has a separate representation (see 3.1.5.2). Such 
interpretation may have occurred at the time of scanning if an optical 
character recognition device is used (3.2.4), or later using internal 
character recognition (3.2.5) programs applied to documents in image 
format (3.1.5.1). Such textual interpretation may result in either 
unformatted or formatted text, depending upon the degree of 
sophistication of the device or program. Recognition accuracy may also 
be limited. 

3.3.4.3 Markup Language Encoding 



A computer markup language is a means for describing, for an 
electronically stored document, the complete positioning, format, and 
style of text and image segment representations (3.1.5) within the 
document. When combined with textual representation, it is a means for 
achieving fully formatted text (3.1.5.2.1). When combined with 
relevant image information about document graphics material (if any), 
it may be a means of achieving fully reversible compression (3.3.2.2) of 
the document. An example of a markup language is SGML (Standard 
Generalized Markup Language), which has been adopted by the United 
States Government and by many publishers as a pseudo-standard. 

3.3.4.4 Page Description Language Encoding 

A computer language in which segments of text and images are 
economically described with respect to form, orientation, size, density, 
and other characteristics for purposes of economic transmission across 
networks and between host devices and output devices such as printers. 
Page Description Languages are another form of compression (3.3.2), as 
well as a form of encoding. 

3.3.5. Useful Life 



Useful Life refers to the archival quality of the storage medium. 
It usually refers to the period of time during which there is no 
unacceptable loss of information stored on the medium; and 
during which the storage medium remains usable for its 
intended purpose. 

The longevity of paper varies considerably depending upon its 
method of manufacture and conditions of storage (see 1.5). 
Unless the paper is produced to meet permanent standards 
(1.5.1), paper may last from a few years or so to hundreds of 
years. Most paper produced since the middle of the nineteenth 
century has a useful life of less than 100 years. Paper produced to 
meet archival standards should last several hundred years. Film, 
provided it is manufactured, processed, and stored according to 
archival standards, appears to have a useful life well in excess of 
500 years. Videotape appears to be extremely vulnerable and to 
have a relatively short life of a few decades. 

Digital electronic storage media have a varying useful life 
projected to range from a few years to over 100 years. The latter 
has not been formally tested by experience, but is projected based 
on laboratory stress tests. Such media, however, become obsolete 
for other reasons long before their physical properties render 
them useless (see, for example, 3.3.3). It becomes economically 
and functionally infeasible to maintain the information stored 
on the original medium of capture, since it becomes far cheaper 
to transfer the information periodically to higher density and 
cheaper newer technologies. Concerns also exist regarding the 
possibility of modifying digitally-encoded documents, 
particularly when "read/write" (3.3.1.6) devices are used (this is 
essentially not possible with "read only" or "write once, read 
many" technologies (3.3.1.6)); and regarding other issues of 
security. 

The implications of periodic recopying for libraries are quite 
far-reaching. Libraries are not used to having to maintain their 
inventory by periodic recopying, even though such practices are 
quite common in data centers. Indeed, the recent impetus of 
preservation may have caused some librarians to rethink their 
position in this regard, although librarians still tend to think in 
terms of periods of centuries rather than having (or wanting) to 
recopy every few years. Such considerations may either hinder 
the adoption of digital technologies or eventually cause some 
rethinking of the underlying economics of librarianship. 

Further implications are discussed in the Introduction. 

3.4. Access Methodology or Technology 

Access Methodology or Technology refers to the means of 
selecting information from among all the information that is 
stored. 

3.4.1. Indexed Access 



A Document Index is a systematically ordered file of objects26 
that refer to a collection of documents or to specific parts of those 
documents, organized in such a way as to facilitate searching the 
document collection for purposes of selection of single 
documents or groups of documents contained in the collection. 
Such document indices may be stored on different media 
depending upon how they are to be used. 

3.4.1.1 Via Catalog 



Access via a file of bibliographic records, created according to specific 
and uniform principles of construction and under the control of an 
authority file, which describes the documents contained in a collection. 
The file is usually organized in a systematic manner to facilitate access 
and document selection. Catalogs historically have been implemented 
in card files, but increasingly such card files are retroactively and 
prospectively giving way to computerized data files (1.2.10) which 
may be accessed and searched by patrons with the use of computer 
workstations (3.6.2.6) and data networks (3.5.5). Such computer-based 
catalogs are increasing in sophistication to support complex queries, 
including Boolean queries, which support logical searching (e.g., all 
the works of fiction written in Albania published between 1890 and 
1919 by authors whose last name begins with the letter "L"). 
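
The example query above can be sketched as a Boolean AND of four conditions over bibliographic records. This is an illustration only: the record fields, the `CatalogRecord` and `matches_query` names, and the sample entries are all made up for this sketch and do not represent any actual catalog system.

```python
from dataclasses import dataclass


@dataclass
class CatalogRecord:
    author_last: str
    title: str
    genre: str
    country: str
    year: int


def matches_query(record):
    """Boolean AND of four conditions, mirroring the example in the text."""
    return (
        record.genre == "fiction"
        and record.country == "Albania"
        and 1890 <= record.year <= 1919
        and record.author_last.startswith("L")
    )


# Made-up sample records, for illustration only.
catalog = [
    CatalogRecord("Lname", "Sample Novel A", "fiction", "Albania", 1905),
    CatalogRecord("Lname", "Sample Essay B", "essay", "Albania", 1905),
    CatalogRecord("Mname", "Sample Novel C", "fiction", "Albania", 1910),
    CatalogRecord("Lname", "Sample Novel D", "fiction", "Albania", 1925),
]

selected = [r.title for r in catalog if matches_query(r)]
```

Only the first record satisfies every condition; each of the others fails exactly one of them (genre, author initial, or date range).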

3.4.1.2 Via Abstract 



Access via a summary of the document. Most often, the summary is of a 
contribution to a journal (1.2.6) or other periodical (1.3.2). Such a 
summary is usually without interpretation or criticism, and may 
contain a bibliographic reference (or pointer) to the original document. 
A collection of document abstracts may be used for purposes of search 
and selection (e.g., Chemical Abstracts, published by the American 
Chemical Society and also available in digital electronic form). 



3.4.1.3 Via Table of Contents 



Access via a list of parts contained in a document, such as chapter titles 
or articles in a periodical, with references by page number or other 
locator to the starting point of the particular part, usually ordered by 
sequenced groupings of the order of appearance. Collections of tables of 
contents may also be used for search and selection purposes. 



26 See Footnote 13. 

Other parts of documents that may be used for search and 
selection purposes include: 



3.4.1.4 Via List of Figures, Tables, Maps or Other Illustrations 

Access via a list of those parts of a document that are either figures, 
tables, maps or other illustrations, respectively, with location 
reference by page number or other locator, usually ordered by location of 
appearance within the document. Figures, tables, maps, etc. may be 
listed separately. Usually, in a document, these lists follow the Table 
of Contents in some order. 

3.4.1.5 Via Preface 

Access via a note preceding the body of a document that usually states 
the origin, purposes, and scope of the work(s) contained in the document 
and may include acknowledgements of assistance. When written by 
someone other than the author(s) of the document, the preface is more 
properly termed a foreword. 

3.4.1.6 Via Introduction 

Access via the material that heads the body of a document and that 
provides an overview of the work that follows, or other introductory 
material to the text. 

3.4.1.7 Via Index 

Access via a systematically ordered collection of words or other terms 
or objects27 contained within a document, with references by page 
number or other locator to the placement of the object within the 
document for purposes of accessing the object. The index is usually 
placed last in a document. 

3.4.1.8 Via Citation 

Access via reference to a document or to a part of a document, such as an 
article in a journal (1.2.6). A bibliography is a collection of citations 
directed to a specific purpose, such as a subject bibliography or a 
bibliography of citations appended to a journal article. 

3.4.2. Full (or Partial) Document Access 

Full Document or full text searching is where the full text of a 
collection of documents is stored, and the entire text of all or 
portions of the documents is searched for specific character 
strings, usually combined with some Boolean logical searching 
capabilities. This requires that the document be textually 
encoded (3.3.4.2), either because it was initially created that way or, 
perhaps more likely in the context of preservation, because such 
textual encoding was obtained from scanned document images 
(3.1.5.1) with internal character recognition (3.2.5). Thus, for 
example, a search may consist of searching for all documents in 
the collection published by a given author or set of authors 
between certain dates containing the text "all that glitters." Full 
text searching is normally implemented on computers. For other 
than small collections of documents, a given search may be very 
costly in terms of computer processing time. 

27 See Footnote 13. 

3.4.2.1 Via Inverted Text File Index 

Inverted Text Files (or other similar techniques) are often used 
as a compromise between indexed and full text searching. A file of 
words (Keyword), phrases (Key Phrase), or other text objects contained 
in a given collection of stored documents is created from an initial 
analysis of the full text, together with locators as to where all instances 
of the word, phrase, or other object can be found within the file. In use, 
instead of the full text being searched for all occurrences of the object,28 
the inverted file itself efficiently gives pointers to the locations. The 
construction of such an inverted file, however, may be expensive for 
large collections of documents, as would adding new words or other 
objects29 to the file at a later date. Furthermore, the use of the file is 
only as good as the care that has been given to the choice of objects to be 
contained within the file. 
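
A minimal sketch of the idea, assuming single words as the indexed objects and word positions as the locators; the function names and the three sample documents are made up for illustration, and a production index would need to handle phrases, updates, and far larger collections.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to a list of (document id, word position) locators."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for position, word in enumerate(re.findall(r"[a-z]+", text.lower())):
            index[word].append((doc_id, position))
    return index


def search_all(index, words):
    """Boolean AND: ids of the documents that contain every given word."""
    sets = [{doc_id for doc_id, _ in index.get(w, [])} for w in words]
    return set.intersection(*sets) if sets else set()


# Made-up sample collection.
documents = {
    "d1": "All that glitters is not gold",
    "d2": "Gold is heavy",
    "d3": "All is well",
}
index = build_inverted_index(documents)
hits = search_all(index, ["all", "is"])
```

The full text is analyzed once, when the index is built; each later query consults only the index, which is the trade-off between construction cost and search cost noted above.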

3.4.3. Compound Document Access 

Compound documents are documents that contain both 
textually and other forms of encoded information, including 
image (see 3.3.4). Techniques are being developed for expanding 
the concept of text searching to searching of full compound 
documents, including those containing image objects30. A full 
glossary of such techniques, however, is premature and beyond 
the scope of this document. 



28 See Footnote 13. 

29 See Footnote 13. 

30 See Footnote 13. 

3.5. Distribution Technology 

Distribution Technology refers to the technology used to 
distribute or deliver the stored encoded document from one 
point to another. Some form of delivery service may be used 
(3.5.2), or, if the medium is paper, it may be distributed using 
point-to-point or distributed FAX (3.5.3). On the other hand, if 
the medium is digital electronic, then either the document may 
be converted to paper, by "printing-on-demand" (3.5.4) and 
subsequently distributed using delivery services or FAX, or data 
networks (3.5.5) may be used for distribution to a computer 
workstation (3.6.2), possibly to be converted to another medium, 
such as paper, at the point of delivery (see 3.6.1). 

3.5.1. Distribution Medium 

The Distribution Medium is the medium used to transport the 
stored encoded document to the presentation or viewing device 
(3.6.2). The same media that can be used for original documents 
(1.1) can also be used as distribution media. 

3.5.1.1 Paper (see 1.1.1) 

3.5.1.2 Microform (see 1.1.2) 

3.5.1.3 Video (see 1.1.3) 

3.5.1.4 Film (see 1.1.4) 

3.5.1.5 Audio (see 1.1.5) 

3.5.1.6 Digital Electronic (see 1.1.6) 

Whichever technology is used for storage (3.3.1), digital technologies 
may usually be used as the medium of distribution, as contrasted with 
using delivery services (3.5.2) to deliver the document. Paper, for 
example, can be scanned and transmitted by FAX (3.5.3) or across data 
networks (3.5.5). The only exception to this at this time is video, which 
is normally distributed by analog electronic distribution networks (as 
opposed to digital; see 1.1.6), because of the high information 
capacity (bandwidth) required. As the bandwidth of data networks 
grows, however, it is anticipated by many technologists that analog 
transmission will yield to digital transmission even for video 
recordings. Films, too, are often transmitted by converting them to video 
recordings (with some loss of quality at this time), and transmitting 
them across analog video networks. 

3.5.2. Messenger Services 

Messenger Services refers to the use of local, regional or 
national messengering or mail services to hand-deliver 
documents from the point of inventory or storage to the patron 
or consumer. One special case of this includes the patrons 
performing the messengering services for themselves by 
viewing the document, or by directly acquiring it (purchasing or 
borrowing), at or from the location of the document's storage. 

3.5.3. FAX 



FAX or Facsimile Transmission is a system of communication or 
delivery for paper documents or other graphics material in 
which a special digital image scanner (3.2.3) scans the pages of 
the document, compresses the scanned image using CCITT 
Group Compression (3.3.2.2.1), and transmits the digital signals 
by wire or radio to a FAX receiver at a remote point. The FAX 
receiver decompresses the signals received and prints the digital 
image on paper. FAX transmission is a point-to-point protocol 
that is normally conducted over voice (3.5.6) or data (3.5.5) 
networks. Usually, scanning and printing devices are relatively 
slow (about 5 pages per minute), and the quality is limited. The 
popularity of FAX rests on its simplicity of use and the relatively 
low cost of the equipment. With the rapid growth of installed 
FAX equipment, FAX has recently been extensively used for 
inter-library loan purposes, and is also becoming used for intra- 
campus delivery purposes. 

3.5.4. Print-on-Demand 

Print-on-Demand refers to the capability to print documents 
right at the time they are required by patrons and consumers, 
rather than following traditional norms of printing documents 
in advance of need and coping with the need to distribute and 
inventory printed documents in anticipation of demand. This 
approach to distribution mirrors the "just-in-time" approach to 
inventory control. Print-on-Demand techniques are normally 
used in conjunction with digitally stored documents (3.3.1.6) and 
data networks (3.5.5). The approach offers the promise of closing 
the gap between the world of digital technologies and those who 
maintain the superiority or simply prefer the characteristics of 
paper documents. Documents may be printed right in the 
patron's office or at a shared local facility from where they are 
delivered to or picked up by the patron. 

3.5.5. Data Networks31 

A Data Network is a communications network that transports 
data between and among computers and computer workstations 
(network nodes). Such networks may depend upon different 
physical media to transport the encoded digital signals (twisted 
pair copper wire, coaxial cable, fiber optic cable, satellite, and so 
forth); different protocols to encode the signals; and different 
ways in which the encoded signals are interpreted for use in 
applications. They also include bridges, routers, and gateways for 
connecting different media and for translating one protocol into 
another. Data networks vary considerably in speed and capacity, 
depending upon the physical media, the protocols used, and the 
particular architecture of the network. Network speeds and other 
performance characteristics appear to be more than doubling 
every two to three years. 

3.5.5.1 Local Area Network 

A Local Area Network (LAN) is a data network used to connect nodes 
that are geographically close, usually within the same building. In a 
wider view of a local area network, multiple local area networks are 
interconnected in a geographically compact area (such as a university 
campus), usually by attaching the LANs to a higher-speed local 
backbone. 



3.5.5.2 Wide Area Network 

A Wide Area Network (WAN) is a data network connecting large 
numbers of nodes and LANs that are geographically remote, such as 
within a broad metropolitan area, or between widely-separated 
metropolitan areas. This would also include regional networks, such as 
NYSERNet, which interconnects research and educational institutions 
in New York State. 



31 The Technology Assessment Advisory Committee of the Commission on Preservation and 
Access is preparing a report on the implications of data networks. 

3.5.5.3 National Network 

A WAN, or a federation of interconnected WANs, that spans the nation, 
such as the NSFNet, BITNet, CSNet, CREN, and, more generally, the 
Internet and the anticipated NREN (National Research and 
Education Network). These national networks often use a high-speed 
spanning national backbone to interconnect regional WANs. Protocols 
are established to facilitate routing of information across the national 
networks to users at connected nodes. The national networks often have 
international connections and outreach. 



3.5.6. Voice Networks 

Voice Networks are local, national, or international networks 
used to carry voice or telephone traffic. They may be either 
analog or digital (see 1.1.6). Because of different technical 
requirements, the transmission of data and voice usually is 
conducted using different transmission protocols, although it is 
increasingly common to share the same wiring plant. In general, 
there is increasing integration between the voice and data 
milieus. 



3.5.7. Cable Networks 



Cable Networks are local, regional, or national networks 
normally used for the transmission of analog (see 1.1.6) signals 
such as video (see 1.1.3) television signals. 

3.6. Presentation Technology 

Presentation Technology is the term given to technologies that 
present the encoded document to the end user or patron, 
possibly following some conversion of one medium to another. 
If the storage medium is paper, for example, no conversion 
would be necessary, and the storage medium and the 
presentation medium are one and the same (unless the 
distribution technology used were, say, FAX, in which case there 
are intervening conversion processes). If the storage medium, 
on the other hand, were digital electronic (3.3.1.6), for example, 
and data networks (3.5.5) were used as the means of distribution, 
then the presentation technology might be a computer 
workstation (3.6.2.6) or the distributed encoded document could 
be converted to some other form such as paper. 

3.6.1. Presentation Medium 

The presentation medium is the medium into which the stored 
document (3.3), which has been distributed over the distribution 
medium (3.5.1), is converted to facilitate viewing or reading by 
the end user. 

3.6.1.1 Paper (see 1.1.1) 

3.6.1.2 Microform (see 1.1.2) 

3.6.1.3 Video (see 1.1.3) 

3.6.1.4 Film (see 1.1.4) 

3.6.1.5 Audio (see 1.1.5) 

3.6.1.6 Digital Electronic (see 1.1.6) 

3.6.2. Presentation or Viewing Device 

A Presentation or Viewing Device converts the distribution 
medium (3.5.1) into the presentation medium (3.6.1). This 
includes the class of computer workstations (3.6.2.6). 

3.6.2.1 Paper Document 

A paper document, such as a book, must itself be considered a viewing 
device in this context when the presentation medium is paper (3.6.1.1). 
See 1.2 for a classification of different formats for paper documents. 

3.6.2.2 Microform Reader 

A display device with a built-in screen and magnification so that a 
microform (1.1.2) can be read comfortably at normal reading distances. 
Such devices may be accompanied by microform printers that can 
produce full-size (generally low-quality) paper copies of the 
microforms. 

3.6.2.3 Video Projector (Television Set) 

A device used to project or play back videotapes (1.1.3 and 3.6.1.3) onto 
a television screen. Normally this is accomplished through the use of a 
videorecorder (see below) and television set or television projection 
system. However, it is becoming increasingly common to play the video 
back through a computer workstation (3.6.2.6), possibly converting the 
analog signal to digital form (1.1.6). 

The term videorecorder is often used to denote a device capable of both 
recording live television signals onto videotape and reading 
recorded videotapes and transmitting the signal to a video projector or 
television set. 

3.6.2.4 Film, Slide, or Other Projectors 

A device to project motion picture films (1.1.4), still photographic 
slides (1.2.9.3), or other graphic materials (1.2.9) onto a screen, and, 
with some devices, to reproduce sound from the film soundtrack. Slide 
viewers enable the user to view the slides through background 
projection on a small screen. Other classes of projectors (such as 
overhead projectors) are designed to project images recorded on 
transparencies onto a screen. 

3.6.2.5 Audio Devices 

A device capable of playing back audio documents (1.1.5), such as a 
phonograph record player, CD player, or tape cassette player. 

3.6.2.6 Computer Workstation 

A device capable of supporting the creation, storage, access, 
distribution, or presentation of digital electronic documents (1.1.6), 
ranging from special purpose devices such as electronic typewriters 
through microcomputers to high-performance engineering or desktop 
publishing workstations or even large mainframe computers. They may 
vary considerably in performance, as typically measured by the 
computer's internal processing speed, storage capacity, and ability to 
move data between its various devices. The traditional distinction 
between a personal computer (PC) and a high-performance workstation 
is blurring, and the term workstation is generically used to cover both. 

3.6.2.6.1 Display Monitor 

That portion of a computer workstation used to view digital 
electronic documents. This may consist of a display module built 
into the computer or it may be physically separated from the 
computer, but attached by cable. Display monitors may be 
black-and-white (1.4.1.1.1), greyscale (1.4.1.1.2), or color 
(1.4.1.4). They may also come in varying physical sizes 
typically ranging from about 8" on the diagonal to 23" or more. 
They may also display with varying resolution, with the 
higher (but not highest) performance monitors capable of 
displaying over 1,000 x 1,000 pixels (spots). 





3.6.2.6.2 Local Printer 



A device locally attached to a computer workstation capable of 
printing digital electronic documents stored in the computer 
(3.3.1.6) or distributed to the computer from across a data 
network (3.5.5). Such devices may utilize a range of 
technologies including impact printing, ink-jet printing, 
thermal printing and laser printing. They may print at varying 
speeds ranging from 10 characters per second to some tens of 
pages per minute. They may print with resolutions varying 
from several dots per linear inch to several hundred dots per 
linear inch. They may print in black-and-white, greyscale, or 
color. 

3.6.2.6.3 Remote Printer 

A printer (3.6.2.6.2) that is accessible to a computer workstation 
remotely across a data network (3.5.1.6). These may typically 
be higher performance devices than local printers, particularly 
regarding speed or resolution. Such devices are typically 
shared among many uses and users. They may have special 
capabilities for "finishing" documents. 

3.6.2.6.4 Other Local Media Output Devices 

Computers capable of supporting multi-media (3.6.2.7) may 
support other "presentation" devices, such as television 
monitors for video recordings (although the trend is to combine 
the television video monitor and the computer display monitor 
into a single "head"), and audio playback devices for sound 
signals, including connections to "hi-fi" stereo equipment. 

3.6.2.7 Multi-Media Workstation 

A computer workstation (3.6.2.6) capable of supporting and combining 
multiple media such as digital electronic, video, sound, and paper 